Northwestern Events Calendar


WED@NICO SEMINAR: Momin Malik, Harvard University "Revisiting 'All Models are Wrong': Addressing Limitations in Big Data, Machine Learning, and Computational Social Science"

When: Wednesday, February 5, 2020
12:00 PM - 1:00 PM  

Where: Chambers Hall, Lower Level, 600 Foster St, Evanston, IL 60208

Audience: Faculty/Staff - Student - Public - Post Docs/Docs - Graduate Students

Cost: Free

Contact: Meghan Stagl, 847.491.2527

Group: Northwestern Institute on Complex Systems (NICO)

Category: Academic



Momin Malik - Data Science Postdoctoral Fellow, Berkman Klein Center for Internet & Society, Harvard University


Revisiting "All Models are Wrong": Addressing Limitations in Big Data, Machine Learning, and Computational Social Science


In the immortal words of George E. P. Box (1979), "All models are wrong, but some are useful." This is an important lesson to recall amidst hopes and claims that digital trace data, the high-dimension and low-assumption models of machine learning, and advancements in computational social science are overcoming the limitations of the past. In this talk, I review the fundamental limitations with which all quantitative research must grapple, and discuss how these limitations manifest today.

Larger datasets capture more heterogeneity and allow the study of finer and finer subpopulations and phenomena, but as I demonstrate with geotagged tweets, selection bias still prevents results from generalizing to larger populations. The platforms from which we gather data are not research utilities, and I model the introduction of Facebook's "People You May Know" recommender system to show how social media platforms' efforts to solicit desirable behavior from users change what we think we observe. By comparing co-location data from mobile phone sensors with self-reported friendship, I consider how new forms of measurement do not necessarily supersede previous forms but instead capture different underlying constructs, which can be fruitful opportunities for research. I conclude with a theoretical overview of the limitations of different forms of quantitative modeling, from the inevitable reliance on central tendencies in probability-based modeling to the ways cross-validation can break down in the presence of dependencies.

This talk will serve as a useful overview of modeling limitations and critiques, as well as possible fixes, for researchers in and practitioners of data science, computational social science, social physics, statistics, and machine learning. It will also be useful as a primer for those outside these fields on the appropriate and inappropriate uses of techniques from them.

Speaker Bio:

Momin M. Malik is the Data Science Postdoctoral Fellow at the Berkman Klein Center for Internet & Society at Harvard University. He holds an undergraduate degree in history of science from Harvard, a master's from the Oxford Internet Institute, and a master's in Machine Learning and a PhD in Societal Computing from the School of Computer Science, Carnegie Mellon University, where his dissertation measured how much social media platform effects, demographic biases, and reliance on mobile phone sensor data can threaten the generalizability of findings in computational social science. His current work bridges machine learning and science studies to understand the sources of both successes and failures in machine learning.

About the Speaker Series:

Wednesdays@NICO is a vibrant weekly seminar series focusing broadly on the topics of complex systems and data science. It brings together attendees ranging from graduate students to senior faculty who span all of the schools across Northwestern, from applied math to sociology to biology and every discipline in between.
