Big data collected from the internet have generated significant interest not only in the academic community but also in industry and government agencies. They offer great potential for tracking and predicting large-scale social activities. In this talk, we focus on tracking disease epidemics. We will discuss applications, in particular Google Flu Trends, along with some of the fallacies and the statistical implications. We will propose a new model that uses publicly available online data to estimate disease epidemics. Our model outperforms all previous real-time tracking models for influenza epidemics at the national level in the US. An extended version of the model gives accurate tracking of dengue fever in Asian and South American countries. We will also draw some lessons for big data applications.
Samuel Kou is Professor of Statistics at Harvard University. He received a bachelor's degree in computational mathematics from Peking University in 1997, followed by a Ph.D. in statistics from Stanford University in 2001. After completing his Ph.D., he joined Harvard University as an Assistant Professor of Statistics. He was promoted to full professor in 2008.
His research interests include stochastic inference in biophysics, chemistry and biology; protein folding; big data analytics; digital disease tracking; Bayesian inference for stochastic models; nonparametric statistical methods; model selection and empirical Bayes methods; and Monte Carlo methods.
He is the recipient of the COPSS (Committee of Presidents of Statistical Societies) Presidents' Award; the Guggenheim Fellowship; a US National Science Foundation CAREER Award; the Institute of Mathematical Statistics Richard Tweedie Award; the Raymond J. Carroll Young Investigator Award; and the American Statistical Association Outstanding Statistical Application Award. He is an elected Fellow of the American Statistical Association, an elected member of the International Statistical Institute, and an elected Fellow and a Medallion Lecturer of the Institute of Mathematical Statistics.
Today's finance industry is continuously searching for alpha, through more advanced modeling and through alternative sources of data. Many alternative sources of data are not raw numerical data but structured and unstructured documents - tables, graphics and text. In this talk, we present ML methods for extracting information from documents. We also present a case study of how such information can be combined with market data for better predictive modeling.
David Rosenberg is a data scientist in the data science group in the Office of the CTO at Bloomberg, and an adjunct associate professor at the Center for Data Science at New York University, where he has repeatedly received NYU's Center for Data Science "Professor of the Year" award. He received his Ph.D. in statistics from UC Berkeley, where he worked on statistical learning theory and natural language processing. David received a Master of Science in applied mathematics, with a focus on computer science, from Harvard University, and a Bachelor of Science in mathematics from Yale University.
Amanda Stent is an NLP architect in the data science group in the Office of the CTO at Bloomberg LP. Previously, she was a director of research and principal research scientist at Yahoo Labs, a principal member of technical staff at AT&T Labs - Research, and an associate professor in the Computer Science Department at Stony Brook University. Her research interests center on natural language processing and its applications, in particular topics related to text analytics, discourse, dialog and natural language generation. She holds a PhD in computer science from the University of Rochester. She is co-editor of the book Natural Language Generation in Interactive Systems (Cambridge University Press), has authored over 90 papers on natural language processing, and is co-inventor on over twenty patents and patent applications.
Several examples of how statistical methodology was used to improve patient care and prostate cancer control outcomes will be discussed.
Anthony D'Amico is the Eleanor Theresa Walters Distinguished Chair, Chief of Genitourinary Radiation Oncology at the Dana-Farber Cancer Institute and Brigham and Women's Hospital, Chair of the residency executive committee in the Harvard Radiation Oncology Program, and Advisory Dean and Chair of career advising and mentorship at Harvard Medical School.
Dr. D'Amico is an internationally known expert in the treatment of prostate cancer and has defined combined modality staging, which is used to select patients with localized prostate cancer for specific surgical or radiotherapeutic treatment options. He is the principal investigator of several federally funded grants that support his investigations in image-guided therapy for early-stage prostate cancer, drug development for advanced-stage prostate cancer, and clinical trials aimed at defining future management strategies for men with prostate cancer.
Dr. D'Amico holds two undergraduate and three graduate degrees: a B.S. in physics, a B.S. in nuclear engineering, an M.S. in nuclear engineering, and a Ph.D. in radiation physics, all from the Massachusetts Institute of Technology, and an M.D. from the University of Pennsylvania School of Medicine. He completed his residency in the department of radiation oncology of the Hospital of the University of Pennsylvania in Philadelphia, where he served as chief resident during his final year. He has over 300 peer-reviewed original publications and editorials. His teaching contributions include his positions as Co-director of the Harvard Combined Residency in Radiation Oncology and Master of the Oliver Wendell Holmes Society at Harvard Medical School. He is also an editorial board member of 6 scientific journals, an expert reviewer for 14 journals, including the New England Journal of Medicine and the Journal of the American Medical Association, and the editor of four textbooks on the management of prostate cancer.
Dr. D'Amico has been awarded the Best Doctor in America award annually since 2009 for his work in prostate cancer. He is the 2012 Harvard Medical School Arnold P. Gold Awardee for Humanism in Medicine and the recipient of the Harvard Medical School Class of 2014 and 2015 Career Advising and Mentoring Faculty Teaching Award, in addition to the 2015-2016 Morton M. Kligerman Award provided by the Hospital of the University of Pennsylvania. In December 2016, his editorial on a landmark prostate cancer study in the New England Journal of Medicine was cited in The New Yorker as one of the most notable medical findings of 2016.