Statathon will be held jointly by the Department of Biostatistics and the Department of Mathematics and Statistics at Boston University and the New England Statistical Society at the 36th NESS Symposium (June 3, 2023 – June 6, 2023). Statathon is a statistical data science invention marathon. Anyone with an interest in data science can attend Statathon to approach real-world data science problems, some of them local, in new and innovative ways. It emphasizes the statistical aspects of data science problems (insight, interpretation, significance, etc.) that are often overlooked in hackathons.


30 March, 2023

Registration Opens

Online registration has started! Fill out this form to sign up. (Data sets will be released in early April.)

8 May, 2023

Registration Closes

Teams or individual participants should register by this deadline. Online registration will close at 11:59 pm EDT.

22 May, 2023

Submission Deadline

Deadline for teams to submit their work for the panelists to review. Submission will close at 11:59 pm EDT.

29 May, 2023


Finalist teams are selected and notified.

5 June, 2023


Finalist teams present virtually to the review panel at the 36th NESS Symposium. The presentations are tentatively scheduled for 2:00 pm – 5:00 pm EDT on Monday, June 5, 2023.

TBD


Awards to winning teams at the closing ceremony.

Themes and Data

Theme 1: Fraud Detection

Imagine you work as a modeler in the fraud detection department of Travelers Insurance Company. Your colleagues, who are unfamiliar with statistics, would like you to create a predictive model based on historical insurance claim data. Your team is concerned with fraud detection accuracy as well as the key drivers of fraud. For this case competition, your group is tasked with identifying first-party physical damage fraud and explaining the indicators of fraudulent claims.
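As a rough illustration of why "accuracy" deserves care in a fraud setting, the sketch below compares two sets of predictions with the same overall accuracy but very different ability to catch fraud. It assumes that fraudulent claims are a minority of all claims, which the theme description does not state. The labels, the predictions, and the function name `precision_recall` are hypothetical and are not part of the competition data or its evaluation metric.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (precision, recall) for binary labels where 1 means fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of claims whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy labels: 8 legitimate claims followed by 2 fraudulent ones.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Model A never flags fraud; model B flags two claims and catches one fraud.
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
pred_b = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

for name, pred in (("A", pred_a), ("B", pred_b)):
    p, r = precision_recall(y_true, pred)
    print(name, "accuracy:", accuracy(y_true, pred), "precision:", p, "recall:", r)
```

Both toy models classify 8 of 10 claims correctly, yet only model B identifies any fraud, so teams may want to report precision and recall alongside accuracy.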

For more details about this theme, please register as a team or register to join a team for the Statathon, and we will send you a link to work on this challenge through Kaggle. The top 5 teams will be invited to give a virtual presentation of their solution (15 minutes) and answer questions from the judges, who will determine the winning teams.

(Data sets are synthetic, provided by Travelers)

More information available here

Theme 2: Did the Customer Take Action after the Alert?

HSB offers equipment breakdown insurance and provides Internet of Things (IoT) sensor solutions to help customers avoid or reduce damage to their building, equipment, or machinery. These sensors monitor essential health and performance variables that alert customers to take corrective action before critical system failure occurs.

Using these sensors, HSB has created a customer alert program in which an alert (in the form of text, email, etc.) is sent to a customer upon detection of abnormal sensor activity. To determine the success and usefulness of this alert program, HSB must record whether or not a customer took corrective action after receiving an alert. Unfortunately, direct follow-up with all customers is infeasible due to the large number of deployed sensors and potential customer unresponsiveness, so an automated system is needed to deduce whether a customer took appreciable corrective action.

In this challenge, your team will be given time series data for a custom system health metric, covering 25 alerts in total. Using whatever methodology you see fit, develop a model that determines whether an insured took action to mitigate poor operating conditions after receiving an alert.
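One very simple baseline for this task is to compare the health metric shortly before and shortly after each alert. The sketch below is only an illustration under stated assumptions: it assumes that higher metric values mean poorer operating conditions (the actual direction and scale of the HSB metric are described in the dataset documentation, not here), and the function name `took_action`, the window size, and the threshold are all hypothetical choices.

```python
from statistics import mean
from typing import Sequence


def took_action(metric: Sequence[float], alert_idx: int,
                window: int = 5, min_relative_drop: float = 0.2) -> bool:
    """Hypothetical baseline: guess that action was taken if the metric drops.

    Assumes higher values mean poorer operating conditions. Compares the mean
    of the `window` points before the alert with the mean of the `window`
    points after it, and reports True when the relative drop is at least
    `min_relative_drop`.
    """
    before = metric[max(0, alert_idx - window):alert_idx]
    after = metric[alert_idx + 1:alert_idx + 1 + window]
    if not before or not after:
        raise ValueError("need observations on both sides of the alert")
    baseline = mean(before)
    if baseline == 0:
        return False  # relative drop is undefined; treat as no evidence of action
    return (baseline - mean(after)) / abs(baseline) >= min_relative_drop


# Hypothetical toy series with an alert at index 5.
acted = [1.0, 1.1, 3.0, 3.2, 3.1, 3.3, 1.2, 1.0, 1.1, 0.9, 1.0]
ignored = [1.0, 1.1, 3.0, 3.2, 3.1, 3.3, 3.2, 3.4, 3.1, 3.3, 3.2]

print(took_action(acted, alert_idx=5, window=3))    # True: metric recovers
print(took_action(ignored, alert_idx=5, window=3))  # False: metric stays high
```

A rule like this ignores noise, seasonality, and slow recoveries, so it is better treated as a reference point for more careful statistical models than as a solution.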

For a full description of the dataset, please see this link

A brief data tutorial can also be found here



All teams should register online. If you already have a team or want to participate as an individual, please register using the following link.

Registration form for teams or individual participants.

Each team may have up to four members, and each team should submit only one registration form listing the names of all team members.

Report submission

All teams should submit their work by the deadline (May 22, 2023, 11:59 pm EDT). Teams are encouraged to create a Git repository (e.g., on Bitbucket, GitHub, or GitLab) to host their source code and data information. However, this is not a review factor in the competition.

Team presentations

Ten teams (five from each theme) will be selected as finalists and invited to give a team presentation to the review panels in the afternoon or evening of June 5, 2023. Each team will have 20 minutes to present their findings and products.


Who can participate?

Students from universities and high schools can participate. We will not distinguish among high school, undergraduate, and graduate students.

Do I have to pay to participate?

No. Participation in Statathon is free. We will select five finalist teams from each theme to come and present the day before the 36th NESS Symposium.

Will the presentation take place in person?

To be determined. Presentations may well be hybrid this year.

How big can a team be?

Each team can have up to 4 participants.

How can I form a team?

Participants can form teams (or work individually) with fellow students who share common interests and/or have complementary expertise. Remember that a participant can be a member of only one team.

When can I start working on the problem?

You can start working on the problem when the data sets are released. We are working to release the data sets in early April.

What programming language can I use?

You can use any programming language or software package.

Will there be prizes?

Yes! There will be cash prizes, ranging from $100 to $300, for the 1st, 2nd, and 3rd place teams in each theme.

Where can I find the data?

We will let you know once the data sets are made available.

When do I need to finalize my team?

Teams must be finalized no later than May 8, when registration closes.

Can a professor or another professional act as a team mentor?

Yes, a professor or another professional can act as a team mentor. However, a mentor is not a participant and therefore cannot implement any work for the team. In addition, no one on the organizing committee or the refereeing committee is allowed to supervise any participating team or individual.

Contact Info


Patrick Buckley, Travelers

Nathan Lally, Hartford Steam Boiler (Munich Re Group)

Kelly Li, Travelers

Daeyoung Lim (Co-Chair), University of Connecticut

Peng Xiao, Hartford Steam Boiler (Munich Re Group)

Adnan Smajic, Travelers

Masanao Yajima (Chair), Boston University


For any further questions, please send them to

