LLM reasoning is becoming a powerful tool for mathematics, planning, decision support, and even agents acting on your behalf in the world. However, LLMs can make mistakes, suggest (or perform) unsafe actions, and in general are not fully reliable. In this work, we consider the problem of interactively training verifiers that could interface with these generators to make them more reliable, safer, and better aligned with their users' preferences. We consider this problem from a learning-theoretic perspective, analyzing the types of guarantees achievable for learning capable verifiers for these tasks, and highlighting the different roles of "soundness" versus "completeness" forms of errors in our analysis. This work is joint with Maria-Florina Balcan, Korinna Fragkia, Zhiyuan Li, and Dravyansh Sharma.
Avrim Blum is Professor, Chief Academic Officer, and Interim President at the Toyota Technological Institute at Chicago (TTIC). Before TTIC, he was a faculty member in the Computer Science Department at Carnegie Mellon University for 25 years. Blum's main research interests are in Machine Learning Theory, Algorithmic Game Theory, Approximation Algorithms, and Societal Issues in Machine Learning. He has served as Program Chair for the Conference on Learning Theory (COLT), the IEEE Symposium on Foundations of Computer Science (FOCS), and the Innovations in Theoretical Computer Science Conference (ITCS). Blum is a recipient of the AI Journal Classic Paper Award, the ICML/COLT 10-Year Best Paper Award, the ACM Paris Kanellakis Award, the Sloan Fellowship, the NSF National Young Investigator Award, and the Herbert Simon Teaching Award, and he is a Fellow of the ACM.
Audience
- Faculty/Staff
- Students
- Postdocs
- Graduate Students
Contact
Amani Walker
Interest
- Academic (general)
- Sciences