Big data collected from the internet have generated significant interest not only in the academic community but also in industry and government agencies. They offer great potential for tracking and predicting large-scale social activities. In this talk, we focus on tracking disease epidemics. We will discuss applications, in particular Google Flu Trends, along with some of the fallacies and the statistical implications. We will propose a new model that uses publicly available online data to estimate disease epidemics. Our model outperforms all previous real-time tracking models for influenza epidemics at the national level in the US. An extended version of the model gives accurate tracking of dengue fever in Asian and South American countries. We will also draw some lessons for big data applications.
Samuel Kou is Professor of
Statistics at Harvard University. He received a bachelor's degree in
computational mathematics from Peking University in 1997, followed by
a Ph.D. in statistics from Stanford University in 2001. After
completing his Ph.D., he joined Harvard University as an Assistant
Professor of Statistics. He was promoted to full professor in 2008.
His research interests include stochastic inference in
biophysics, chemistry and biology; protein folding; big data
analytics; digital disease tracking; Bayesian inference for stochastic
models; nonparametric statistical methods; model selection and
empirical Bayes methods; and Monte Carlo methods.
He is the
recipient of the COPSS (Committee of Presidents of Statistical
Societies) Presidents' Award; the Guggenheim Fellowship; a US National
Science Foundation CAREER Award; the Institute of Mathematical
Statistics Richard Tweedie Award; the Raymond J. Carroll Young
Investigator Award; and the American Statistical Association
Outstanding Statistical Application Award. He is an elected Fellow of
the American Statistical Association, an elected member of the
International Statistical Institute, and an elected Fellow and a
Medallion Lecturer of the Institute of Mathematical Statistics.
Today's finance industry is continuously searching for alpha, through more advanced modeling and through alternative sources of data. Many alternative sources of data are not raw numerical data but structured and unstructured documents - tables, graphics and text. In this talk, we present ML methods for extracting information from documents. We also present a case study of how such information can be combined with market data for better predictive modeling.
David Rosenberg is a data scientist in the data science group in the Office of the CTO at Bloomberg, and an adjunct associate professor at the Center for Data Science at New York University, where he has repeatedly received NYU's Center for Data Science "Professor of the Year" award. He received his Ph.D. in statistics from UC Berkeley, where he worked on statistical learning theory and natural language processing. David received a Master of Science in applied mathematics, with a focus on computer science, from Harvard University, and a Bachelor of Science in mathematics from Yale University.
Amanda Stent is an NLP architect in the data science group in the Office of the CTO at Bloomberg LP. Previously, she was a director of research and principal research scientist at Yahoo Labs, a principal member of technical staff at AT&T Labs - Research, and an associate professor in the Computer Science Department at Stony Brook University. Her research interests center on natural language processing and its applications, in particular topics related to text analytics, discourse, dialog and natural language generation. She holds a PhD in computer science from the University of Rochester. She is co-editor of the book Natural Language Generation in Interactive Systems (Cambridge University Press), has authored over 90 papers on natural language processing, and is co-inventor on over twenty patents and patent applications.
This talk will discuss several examples of how statistical methodology has been used to improve patient care and prostate cancer control outcomes.
Anthony D'Amico is the
Eleanor Theresa Walters Distinguished Chair, Chief of Genitourinary
Radiation Oncology at the Dana-Farber Cancer Institute and Brigham and
Women's Hospital, Chair of the residency executive committee in the
Harvard Radiation Oncology Program, and Advisory Dean and Chair of
career advising and mentorship at Harvard Medical School.
Dr. D'Amico is an internationally known expert in the treatment of
prostate cancer and has defined combined modality staging, which is
used to select patients with localized prostate cancer for specific
surgical or radiotherapeutic treatment options. He is the principal
investigator of several federally funded grants that support his
investigations in image-guided therapy for early-stage prostate
cancer, drug development for advanced-stage prostate cancer, and
clinical trials that are aimed at defining future management
strategies for men with prostate cancer.
Dr. D'Amico holds
two undergraduate and three graduate degrees: a B.S. in physics, a
B.S. in nuclear engineering, an M.S. in nuclear engineering, and a Ph.D.
in radiation physics, all from the Massachusetts Institute of
Technology, and an M.D. from the University of Pennsylvania School of
Medicine. He completed his residency in the department of radiation
oncology of the Hospital of the University of Pennsylvania in
Philadelphia, where he served as chief resident during his final year.
He has over 300 peer-reviewed original publications and editorials,
and his teaching contributions include his position as Co-director of
the Harvard Combined Residency in Radiation Oncology, Master of the
Oliver Wendell Holmes Society at Harvard Medical School, editorial
board member of 6 scientific journals, expert reviewer for 14 journals
including the New England Journal of Medicine and the Journal of the
American Medical Association, and editor of four textbooks on the
management of prostate cancer.
Dr. D'Amico has been awarded
the Best Doctor in America award annually since 2009 for his work in
prostate cancer, is the 2012 Harvard Medical School Arnold P. Gold
Awardee for Humanism in Medicine, and is the recipient of the Harvard
Medical School Class of 2014 and Class of 2015 Career Advising and
Mentoring Faculty Teaching Awards, as well as the 2015-2016 Morton M.
Kligerman Award from the Hospital of the University of Pennsylvania. In
December 2016, his editorial on a landmark prostate cancer study in
the New England Journal of Medicine was cited in The New Yorker as one
of the most notable medical findings in 2016.