When:
Friday, February 27, 2026
11:00 AM - 12:00 PM CT
Where: Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208
Audience: Faculty/Staff - Students - Post Docs/Docs - Graduate Students
Cost: Free
Contact:
Kisa Kowal
(847) 491-3974
k-kowal@northwestern.edu
Group: Department of Statistics and Data Science
Category: Academic, Lectures & Meetings
What functions does XGBoost learn?
Aditya Guntuboyina, Associate Professor, Department of Statistics, University of California, Berkeley
Abstract: We develop a theoretical framework that explains what kinds of functions XGBoost is able to learn. We introduce an infinite-dimensional function class that extends ensembles of shallow decision trees, along with a natural measure of complexity that generalizes the regularization penalty built into XGBoost. We show that this complexity measure aligns with classical notions of variation—in one dimension it corresponds to total variation, and in higher dimensions it is closely tied to a well-known concept called Hardy–Krause variation. We prove that the best least-squares estimator within this class can always be represented using a finite number of trees, and that it achieves a nearly optimal statistical rate of convergence, avoiding the usual curse of dimensionality. Our work provides the first rigorous description of the function space that underlies XGBoost, clarifies its relationship to classical ideas in nonparametric estimation, and highlights an open question: does the actual XGBoost algorithm itself achieve these optimal guarantees? This is joint work with Dohyeong Ki at UC Berkeley.
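Background (an illustrative sketch of the standard definitions referenced above, not material from the talk): the regularization penalty built into XGBoost charges a tree f with T leaves and leaf weights w_1, ..., w_T

\[
\Omega(f) \;=\; \gamma T \;+\; \tfrac{1}{2}\,\lambda \sum_{j=1}^{T} w_j^2,
\]

and the classical total variation of a function f on an interval [a, b], the one-dimensional notion of variation the abstract aligns this penalty with, is

\[
\mathrm{TV}(f) \;=\; \sup_{a = x_0 < x_1 < \cdots < x_n = b} \; \sum_{i=1}^{n} \bigl| f(x_i) - f(x_{i-1}) \bigr|.
\]

Hardy–Krause variation is the standard multivariate extension of this quantity; the talk's complexity measure is a generalization of the penalty \(\Omega\) whose precise form is developed in the work itself.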