Name: IEMS Seminar: A New One-Point Oracle for Derivative-Free Optimization and Learning
Start: 2022-10-04T11:00:00-05:00
End: 2022-10-04T12:00:00-05:00

Northwestern Events Calendar

IEMS Seminar: A New One-Point Oracle for Derivative-Free Optimization and Learning

When: Tuesday, October 4, 2022
11:00 AM - 12:00 PM CT

Where: Online
Webcast Link

Audience: Faculty/Staff - Student - Public - Post Docs/Docs - Graduate Students

Contact: Agnes Kaminski (847) 491-3576

Group: Department of Industrial Engineering and Management Sciences (IEMS)

Category: Lectures & Meetings

Description:

Michael M. Zavlanos, Ph.D.

Yoh Family Professor

Dept. of Mechanical Engineering & Materials Science

Dept. of Electrical & Computer Engineering

Dept. of Computer Science

Duke University

Title: A New One-Point Oracle for Derivative-Free Optimization and Learning

Abstract: Derivative-free (or zeroth-order) optimization methods enable the optimization of black-box models that are available only in the form of input-output data, and they are common in the training of deep neural networks and in reinforcement learning (RL). In the absence of input-output models, exact first- or second-order information (gradient or Hessian) is unavailable and cannot be used for optimization. Zeroth-order methods therefore rely on input-output data to obtain approximations of the gradients that can be used as descent directions. In this talk, we present a new one-point policy gradient estimator that we have recently developed. It requires a single function evaluation at each iteration and estimates the gradient from the residual between two consecutive feedback points. We refer to this scheme as residual feedback. We show that residual feedback can be used to develop Multi-Agent Reinforcement Learning (MARL) methods with partial state and action observations, as it allows the agents to compute the local policy gradients needed to update their local policy functions using local estimates of the global accumulated rewards. We also analyze the performance of the proposed residual feedback estimator for online learning, where one-point policy gradient estimation is the only viable choice. We show that, in both MARL and online learning, residual feedback induces a smaller estimation variance than other one-point feedback methods and, therefore, improves the learning rate.
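To make the idea of residual feedback concrete, the following is a minimal Python sketch, not the speaker's implementation. The abstract does not state the estimator's formula, so the form used here is an assumption: the gradient estimate is (d / delta) * (f(x_t + delta * u_t) - f(x_{t-1} + delta * u_{t-1})) * u_t, with u_t drawn uniformly from the unit sphere. The function name `residual_feedback_descent`, the test objective `quadratic`, and the values of `delta`, `lr`, and `steps` are all hypothetical choices made for illustration.

```python
import numpy as np


def residual_feedback_descent(f, x0, steps=2000, delta=0.1, lr=0.005, seed=0):
    """Minimize f with one new function evaluation per iteration.

    Assumed estimator form (not given in the abstract):
        g_t = (d / delta) * (f(x_t + delta*u_t) - f(x_{t-1} + delta*u_{t-1})) * u_t
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size

    def random_direction():
        # Normalizing a Gaussian vector gives a uniform sample on the unit sphere.
        u = rng.standard_normal(d)
        return u / np.linalg.norm(u)

    # One initial query so the first iteration has a previous value to subtract.
    prev_value = f(x + delta * random_direction())
    evals = 1

    for _ in range(steps):
        u = random_direction()
        value = f(x + delta * u)  # the only function evaluation this iteration
        evals += 1
        # Residual between two consecutive perturbed evaluations.
        grad_est = (d / delta) * (value - prev_value) * u
        x = x - lr * grad_est
        prev_value = value  # reuse this evaluation in the next iteration
    return x, evals


def quadratic(x):
    # Hypothetical black-box objective used only for this illustration.
    return float(np.dot(x, x))


x_final, n_evals = residual_feedback_descent(quadratic, [1.0, -1.0])
print(quadratic(x_final), n_evals)
```

Each iteration reuses the previous perturbed evaluation in place of a second query, so `steps` iterations cost `steps + 1` evaluations in total. In this sketch the step size `lr` is kept small relative to `delta`, because the change in the iterate between consecutive queries also enters the residual.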

Bio: Michael M. Zavlanos is the Yoh Family Professor in the Department of Mechanical Engineering and Materials Science at Duke University, Durham, NC. He also holds secondary appointments in the Department of Electrical and Computer Engineering and the Department of Computer Science. He is currently also an Amazon Scholar with Amazon Robotics, North Reading, MA. His research focuses on control theory, optimization, and learning and AI, with applications in robotics and autonomous systems, cyber-physical systems, and healthcare/medicine. He is a recipient of various awards, including the 2014 Office of Naval Research Young Investigator Program (YIP) Award and the 2011 National Science Foundation Faculty Early Career Development (CAREER) Award.

Michael M. Zavlanos received the Diploma in mechanical engineering from the National Technical University of Athens (NTUA), Athens, Greece, in 2002, and the M.S.E. and Ph.D. degrees in electrical and systems engineering from the University of Pennsylvania, Philadelphia, PA, in 2005 and 2008, respectively.
