Name: Statistics and Data Science Seminar: "Structure-driven design of reinforcement learning algorithms: a tale of two estimators"
Start: 2024-11-22T11:00:00-06:00
End: 2024-11-22T12:00:00-06:00
Location: Chambers Hall, Ruan Conference Room – lower level

Northwestern Events Calendar

Statistics and Data Science Seminar: "Structure-driven design of reinforcement learning algorithms: a tale of two estimators"

When: Friday, November 22, 2024
11:00 AM - 12:00 PM CT

Where: Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208

Audience: Faculty/Staff - Student - Post Docs/Docs - Graduate Students

Cost: free

Contact: Kisa Kowal (847) 491-3974

Group: Department of Statistics and Data Science

Category: Academic, Lectures & Meetings

Description:

Structure-driven design of reinforcement learning algorithms: a tale of two estimators

Wenlong Mou, Assistant Professor of Statistical Sciences, University of Toronto

Abstract: Reinforcement learning (RL) offers a flexible framework for sequential decision-making in uncertain environments, and its success heavily depends on efficiently learning value functions. Over the years, a diverse range of RL algorithms has been proposed, but at their core, two foundational principles stand out: solving the Bellman fixed-point equations (known as "bootstrapping methods") or averaging the rollout rewards. Despite the success of both approaches, the optimal trade-off between these principles in practical applications remains elusive. Current theoretical guarantees, whether worst-case or asymptotic, often fall short of providing actionable insights.

In this talk, I will discuss recent advances in methods that optimally reconcile bootstrapping and rollout for policy evaluation. The bulk of this talk will focus on a new class of estimators that strikes an optimal balance between temporal difference learning and Monte Carlo methods. Through a statistical lens, I will highlight why the local structure of the underlying Markov chain determines the fundamental complexity of estimation, and how our estimator adapts to this structure. Extending this perspective to continuous-time RL, I will also explore how the elliptic structure of diffusion processes provides key insights for making algorithmic choices.
