Name: Statistics and Data Science Seminar: "An Automatic Finite-Sample Robustness Check: Can Dropping a Little Data Change Conclusions?"
Start: 2024-05-17T11:00:00-05:00
End: 2024-05-17T12:00:00-05:00
Location: Chambers Hall, Ruan Conference Room – lower level

Northwestern Events Calendar

Statistics and Data Science Seminar: "An Automatic Finite-Sample Robustness Check: Can Dropping a Little Data Change Conclusions?"

When: Friday, May 17, 2024
11:00 AM - 12:00 PM CT

Where: Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208

Audience: Faculty/Staff - Student - Post Docs/Docs - Graduate Students

Cost: free

Contact: Kisa Kowal (847) 491-3974

Group: Department of Statistics and Data Science

Category: Academic, Lectures & Meetings

Description:

An Automatic Finite-Sample Robustness Check: Can Dropping a Little Data Change Conclusions?

Tamara Broderick, Associate Professor, Department of Electrical Engineering and Computer Science at MIT

Abstract: Practitioners will often analyze a data sample with the goal of applying any conclusions to a new population. For instance, if economists conclude microcredit is effective at alleviating poverty based on observed data, policymakers might decide to distribute microcredit in other locations or future years. Typically, the original data is not a perfect random sample from the population where policy is applied -- but researchers might feel comfortable generalizing anyway so long as deviations from random sampling are small, and the corresponding impact on conclusions is small as well. Conversely, researchers might worry if a very small proportion of the data sample was instrumental to the original conclusion. So we propose a method to assess the sensitivity of statistical conclusions to the removal of a very small fraction of the data set. Manually checking all small data subsets is computationally infeasible, so we propose an approximation based on the classical influence function. Our method is automatically computable for common estimators. We provide finite-sample error bounds on approximation performance and a low-cost exact lower bound on sensitivity. We find that sensitivity is driven by a signal-to-noise ratio in the inference problem, does not disappear asymptotically, and is not decided by misspecification. Empirically we find that many data analyses are robust, but the conclusions of several influential economics papers can be changed by removing (much) less than 1% of the data.
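
As a rough illustration of the idea in the abstract, the sketch below applies a first-order, influence-style approximation to ordinary least squares. It is not the speaker's implementation. The function name `drop_sensitivity`, its parameters, and the simulated data are all hypothetical, and OLS is chosen only because its influence scores have a simple closed form. The sketch ranks points by their approximate effect on one coefficient, predicts the change from dropping the top `k`, and then refits without those points as an exact check.

```python
import numpy as np


def drop_sensitivity(X, y, coef_index, k):
    """Approximate the k points whose removal most decreases one OLS coefficient.

    Hypothetical helper for illustration only. Returns the original coefficient,
    the first-order prediction after dropping k points, the exact coefficient
    from refitting without them, and the indices of the dropped points.
    """
    n = X.shape[0]
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta

    # Derivative of the coefficient with respect to each point's weight,
    # evaluated at full weight: row coef_index of (X^T X)^{-1} x_i r_i.
    influence = np.linalg.solve(X.T @ X, (X * resid[:, None]).T)[coef_index]

    # Dropping point i moves the coefficient by about -influence[i], so the
    # largest positive scores give the largest approximate decrease.
    drop = np.argsort(-influence)[:k]
    predicted = beta[coef_index] - influence[drop].sum()

    # Exact refit without the selected points; this checks the approximation
    # and shows a change that is actually achievable by dropping k points.
    keep = np.delete(np.arange(n), drop)
    refit = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0][coef_index]
    return beta[coef_index], predicted, refit, drop


if __name__ == "__main__":
    # Simulated data (an assumption for the demo, not from any study).
    rng = np.random.default_rng(0)
    n = 1000
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = 0.1 * X[:, 1] + rng.normal(size=n)

    original, predicted, refit, drop = drop_sensitivity(X, y, coef_index=1, k=10)
    print("original slope:      ", original)
    print("predicted after drop:", predicted)
    print("exact after drop:    ", refit)
```

Because the refit is an exact computation on a specific subset, the change it shows is a lower bound on the largest change that dropping any `k` points could produce, which matches the kind of low-cost exact check the abstract mentions.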
