CS Seminar: Reinforcement Learning and Control with Generative World Models (Arindam Banerjee)

Name: CS Seminar: Reinforcement Learning and Control with Generative World Models (Arindam Banerjee)
Start: 2026-04-13T12:00:00-05:00
End: 2026-04-13T13:00:00-05:00
Location: Mudd Hall (formerly Seeley G. Mudd Library), 3514

Monday, April 13, 2026 | 12:00 PM - 1:00 PM CT

Mudd Hall (formerly Seeley G. Mudd Library), 3514, 2233 Tech Drive, Evanston, IL 60208

Monday / CS Seminar
April 13 / 12:00 PM
Hybrid / Mudd 3514

Speaker
Arindam Banerjee, University of Illinois Urbana-Champaign

Talk Title
Reinforcement Learning and Control with Generative World Models

Abstract

"Recent years have witnessed remarkable advances in generative modeling — from diffusion models and flow matching to autoregressive transformers and action-conditioned video models — that are rapidly closing the gap between learned simulators and the complexity of real-world dynamics. These developments open a principled path toward a new generation of reinforcement learning (RL) algorithms that harness the representational power of generative world models, naturally bridging model-based planning and model-free policy optimization within a unified framework.

In this talk, we introduce an inference-time policy optimization framework inspired by model predictive control (MPC), built around a pretrained policy and a learned world model (WM) of state transitions and rewards. While existing approaches use learned dynamics to generate imagined trajectories — either during training or at inference — they stop short of using those trajectory rollouts to optimize policy parameters on the fly. Our approach addresses this gap through a Differentiable World Model (DWM) pipeline that enables end-to-end gradient computation through WM trajectory rollouts, yielding inference-time policy optimization (ITPO) grounded in MPC. Across continuous-control benchmarks, ITPO with DWM consistently outperforms strong offline RL baselines. Beyond the core RL framework, we also discuss principled approaches to fine-tuning generative models under distribution shift, which enable the online deployment of such world-model-based policies."

Biography
Arindam Banerjee is a Founder Professor at the Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign. He currently serves as President of the Society for Artificial Intelligence and Statistics, which runs the annual international AISTATS conference. He is an ACM Fellow. His research interests are in machine learning and artificial intelligence. His current research focuses on computational and statistical aspects of deep learning, spatial and temporal data analysis, generative models, and sequential decision making. His work also covers applications of machine learning in complex real-world and scientific domains, including problems in weather and climate, ecology, and agriculture. He has won several awards, including the NSF CAREER award, the IBM Faculty Award, and seven best paper awards at top-tier venues.

Research Area/Interest:
Machine Learning, Artificial Intelligence

---
Zoom: TBA
Panopto: TBA

Cost: free

Audience

Faculty/Staff
Student
Post Docs/Docs
Graduate Students

Contact

Wynante R Charles
(847) 467-8174

Group

Department of Computer Science (CS)

Interest

Academic (general)