Northwestern Events Calendar

Algorithms Seminar: Dravy Sharma on "Configuring neural networks and tuning gradient descent hyperparameters"

When: Tuesday, September 23, 2025
1:00 PM - 2:00 PM CT

Where: Mudd Hall (formerly Seeley G. Mudd Library), Room 3514, 2233 Tech Drive, Evanston, IL 60208

Audience: Graduate Students

Contact: Chang Wang  

Group: Department of Computer Science (CS)

Category: Academic, Lectures & Meetings

Description:

Abstract: Modern deep learning algorithms involve careful hyperparameter tuning to achieve the best performance. With many similar tasks and related data available, pre-trained foundation models are ubiquitous, but can we also “pre-train” the hyperparameters? A key challenge is that the network’s performance is a volatile function of the hyperparameters, and the usual assumptions of convexity and smoothness fail to apply. We consider two broad classes of hyperparameters — model hyperparameters that are part of the network architecture and optimization hyperparameters that only impact the training procedure. We develop novel techniques for bounding the pseudo-dimension of a function class involving optimization of piecewise-polynomial functions, implying sample complexity guarantees for configuring model hyperparameters like neural network activation functions. We further show how to provably tune optimization hyperparameters like learning rate and learning schedules, extending the results of Gupta and Roughgarden (2016) beyond smooth, convex regimes to practical non-convex settings including neural networks.
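The data-driven view of hyperparameter tuning sketched in the abstract (in the spirit of Gupta and Roughgarden) can be illustrated with a toy example: given a sample of related tasks, pick the learning rate that minimizes the average loss after a fixed gradient-descent budget. Everything below (the quadratic task family, the grid of candidate rates, the function names) is an illustrative assumption for exposition, not the speaker's actual method or analysis.

```python
import numpy as np

def loss_after_gd(eta, a, x0=1.0, steps=20):
    """Loss f(x) = a * x^2 after `steps` gradient-descent updates with rate eta.
    The update x <- x - eta * 2*a*x has the closed form x_t = x0 * (1 - 2*a*eta)^t."""
    x = x0 * (1.0 - 2.0 * a * eta) ** steps
    return a * x * x

def tune_learning_rate(task_params, eta_grid):
    """Empirical risk minimization over a finite grid of candidate rates:
    return the rate with the lowest average final loss across sampled tasks."""
    avg_losses = [
        np.mean([loss_after_gd(eta, a) for a in task_params])
        for eta in eta_grid
    ]
    return eta_grid[int(np.argmin(avg_losses))]

# Curvatures of a hypothetical family of related quadratic tasks.
tasks = [0.5, 0.8, 1.2, 2.0]
grid = [0.01, 0.05, 0.1, 0.3, 0.9]
best_eta = tune_learning_rate(tasks, grid)  # rate that generalizes across the sample
```

Note the volatility the abstract points to: a rate that is excellent on one task (eta = 0.9 on the flattest quadratic) diverges on another, so the averaged loss over sampled tasks is what a provable tuning guarantee must control.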

Bio: Dravyansh (Dravy) Sharma is an IDEAL postdoctoral researcher, hosted by Avrim Blum at TTIC and Aravindan Vijayaraghavan at Northwestern University. He obtained his PhD at Carnegie Mellon University, advised by Nina Balcan. His research interests include machine learning theory and algorithms, with a focus on provable hyperparameter tuning, adversarial robustness, and learning in the presence of rational agents. His work develops principled techniques for tuning fundamental machine learning algorithms to domain-specific data, including decision trees, linear regression, graph-based learning and, most recently, deep networks. He has published several papers at top ML venues, including NeurIPS, ICML, COLT, JMLR, AISTATS, UAI and AAAI, has had multiple papers selected for oral presentations, won the Outstanding Student Paper Award at UAI 2024, and has interned with Google Research and Microsoft Research. He presented a tutorial at UAI 2025 and has a joint tutorial accepted at NeurIPS 2025.
