Statistics and Data Science Seminar: "Feature learning and 'the linear representation hypothesis' for monitoring and steering LLMs"

Friday, April 17, 2026 | 11:00 AM - 12:00 PM CT
Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208

Feature learning and "the linear representation hypothesis" for monitoring and steering LLMs

Mikhail Belkin, HDSI Endowed Chair Professor in AI, Halicioglu Data Science Institute, University of California San Diego

Abstract: A trained Large Language Model (LLM) contains much of human knowledge. Yet it is difficult to gauge the extent or accuracy of that knowledge, as LLMs do not always "know what they know" and may even be unintentionally or actively misleading. In this talk I will discuss feature learning and introduce Recursive Feature Machines, a powerful method originally designed for extracting relevant features from tabular data. I will demonstrate how this technique enables us to detect LLM behaviors and precisely guide them toward almost any desired concept by manipulating a single fixed vector in the LLM activation space.

Cost: free

Audience

  • Faculty/Staff
  • Student
  • Post Docs/Docs
  • Graduate Students

Contact

Kisa Kowal
(847) 491-3974

Interest

  • Academic (general)
