When: Friday, November 7, 2025, 11:00 AM - 12:00 PM CT
Where: Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208
Audience: Faculty/Staff - Students - Postdocs - Graduate Students
Cost: free
Contact:
Kisa Kowal
(847) 491-3974
Group: Department of Statistics and Data Science
Category: Academic, Lectures & Meetings
Statistical Inference for Temporal Difference Learning with Linear Function Approximation
Alessandro Rinaldo, Professor, Department of Statistics and Data Sciences, The University of Texas at Austin
Abstract:
Policy evaluation is a fundamental task in Reinforcement Learning (RL), with applications in numerous fields such as clinical trials, mobile health, robotics, and autonomous driving. Temporal Difference (TD) learning and its variants are arguably the most widely used algorithms for policy evaluation with linear function approximation. Despite the popularity and practical importance of TD estimators of the parameters of the best linear approximation to the value function, theories and methods for formal statistical inference with finite-sample validity in high dimensions remain limited. Consequently, RL practitioners often lack essential statistical tools to guide their decision-making. To address this gap, we develop efficient inference procedures for TD learning-based estimators under linear function approximation in on-policy settings. We obtain improved consistency rates and derive novel high-dimensional Berry-Esseen bounds for the TD estimator under both independent samples and Markovian trajectories. Additionally, we propose an online algorithm to construct non-asymptotic confidence intervals for the target parameters.
Joint work with Weichen Wu (Voleon) and Yuting Wei (UPenn).
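For readers unfamiliar with the estimator the talk studies, below is a minimal sketch of TD(0) with linear function approximation on a toy Markov reward process. The five-state chain, the random feature map, and the polynomially decaying step size are illustrative assumptions for this sketch, not the speaker's setup or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Markov reward process: a 5-state random-walk chain (illustrative only).
n_states, d, gamma = 5, 3, 0.9
P = np.zeros((n_states, n_states))
for s in range(n_states):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n_states - 1)] += 0.5
r = rng.normal(size=n_states)            # per-state rewards
Phi = rng.normal(size=(n_states, d))     # fixed feature map phi(s), rows of Phi

theta = np.zeros(d)                      # parameters of the linear value approximation
s = 0
for t in range(1, 200_000):
    s_next = rng.choice(n_states, p=P[s])
    alpha = 1.0 / t**0.6                 # polynomially decaying step size (assumed schedule)
    # TD(0) update: theta += alpha * (r(s) + gamma * phi(s')^T theta - phi(s)^T theta) * phi(s)
    td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta += alpha * td_error * Phi[s]
    s = s_next

# The TD fixed point theta* (the best linear approximation the abstract refers to)
# solves Phi^T D (I - gamma P) Phi theta* = Phi^T D r, with D the diagonal matrix
# of the chain's stationary distribution; compare it against the TD iterate.
pi = np.linalg.matrix_power(P, 1000)[0]  # stationary distribution of the chain
D = np.diag(pi)
A = Phi.T @ D @ (np.eye(n_states) - gamma * P) @ Phi
b = Phi.T @ D @ r
theta_star = np.linalg.solve(A, b)
print("TD estimate:", theta.round(3))
print("fixed point:", theta_star.round(3))
```

Run long enough, the iterate drifts toward theta_star; the talk concerns the finite-sample distribution of exactly this kind of estimator and how to build valid confidence intervals around it.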