When: Friday, November 7, 2025, 11:00 AM - 12:00 PM CT
Where: Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208
Audience: Faculty/Staff - Students - Postdocs - Graduate Students
Cost: free
Contact:
Kisa Kowal
(847) 491-3974
Group: Department of Statistics and Data Science
Category: Academic, Lectures & Meetings
Statistical Inference for Temporal Difference Learning with Linear Function Approximation
Alessandro Rinaldo, Professor, Department of Statistics and Data Sciences, The University of Texas at Austin
Abstract:
Policy evaluation is a fundamental task in Reinforcement Learning (RL), with applications in numerous fields such as clinical trials, mobile health, robotics, and autonomous driving. Temporal Difference (TD) learning and its variants are arguably the most widely used algorithms for policy evaluation with linear function approximation. Despite the popularity and practical importance of TD estimators of the parameters of the best linear approximation to the value function, theories and methods for formal statistical inference with finite-sample validity in high dimensions remain limited. Consequently, RL practitioners often lack essential statistical tools to guide their decision-making. To address this gap, we develop efficient inference procedures for TD learning-based estimators under linear function approximation in on-policy settings. We obtain improved consistency rates and derive novel high-dimensional Berry-Esseen bounds for the TD estimator under both independent samples and Markovian trajectories. Additionally, we propose an online algorithm to construct non-asymptotic confidence intervals for the target parameters.
Joint work with Weichen Wu (Voleon) and Yuting Wei (UPenn).
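For readers unfamiliar with the estimator the talk studies, below is a minimal sketch of TD(0) with linear function approximation on a toy Markov reward process. The five-state chain, the random feature map, and the polynomially decaying step size are illustrative assumptions for this sketch, not the speaker's setup or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Markov reward process: a 5-state random-walk chain (illustrative only).
n_states, d, gamma = 5, 3, 0.9
P = np.zeros((n_states, n_states))
for s in range(n_states):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n_states - 1)] += 0.5
r = rng.normal(size=n_states)            # per-state rewards
Phi = rng.normal(size=(n_states, d))     # fixed feature map phi(s), rows of Phi

theta = np.zeros(d)                      # parameters of the linear value approximation
s = 0
for t in range(1, 200_000):
    s_next = rng.choice(n_states, p=P[s])
    alpha = 1.0 / t**0.6                 # polynomially decaying step size (assumed schedule)
    # TD(0) update: theta += alpha * (r(s) + gamma * phi(s')^T theta - phi(s)^T theta) * phi(s)
    td_error = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta
    theta += alpha * td_error * Phi[s]
    s = s_next

# The TD fixed point theta* (the best linear approximation the abstract refers to)
# solves Phi^T D (I - gamma P) Phi theta* = Phi^T D r, with D the diagonal matrix
# of the chain's stationary distribution; compare it against the TD iterate.
pi = np.linalg.matrix_power(P, 1000)[0]  # stationary distribution of the chain
D = np.diag(pi)
A = Phi.T @ D @ (np.eye(n_states) - gamma * P) @ Phi
b = Phi.T @ D @ r
theta_star = np.linalg.solve(A, b)
print("TD estimate:", theta.round(3))
print("fixed point:", theta_star.round(3))
```

Run long enough, the iterate drifts toward theta_star; the talk concerns the finite-sample distribution of exactly this kind of estimator and how to build valid confidence intervals around it.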