Statistics and Data Science Seminar: "Data-Centric Learning: Aligning Data and Model Knowledge for Better AI"

Friday, May 15, 2026 | 11:00 AM - 12:00 PM CT
Chambers Hall, Ruan Conference Room – lower level, 600 Foster St, Evanston, IL 60208

Curing AI Issues at the Source: The Power of Data-Centric Learning

Yanjie Fu, Associate Professor, School of Computing and Augmented Intelligence, Arizona State University

Abstract: Recent progress in AI has been driven largely by scaling models and compute. Yet in many real-world and scientific settings, AI failures are still rooted less in model architecture than in the data itself: missing or incomplete observations, noisy labels, distribution shift, imbalance, poor feature geometry, and weak coverage of the underlying domain. This talk argues for a shift from a model-centric view of AI to a data-centric learning perspective, where the central goal is not only to train better models, but to reshape better data for learning. I will present a unifying view of data-centric learning through the lens of data-model knowledge alignment: data serves as the knowledge base, models learn knowledge from data, and poor alignment between the two leads to poor generalization, shortcut learning, instability, and low trust. I will introduce key directions in this space, including data curation, relabeling, synthetic data generation, feature selection, feature transformation, and data reprogramming. I will also highlight our recent work on AI4Data-RL, AI4Data-GenAI, and AI4Data-LLM&Agents. Overall, the talk will discuss how data-centric learning opens a new path toward more robust and trustworthy AI systems.

Cost: free

Audience

  • Faculty/Staff
  • Student
  • Post Docs/Docs
  • Graduate Students

Contact

Kisa Kowal
(847) 491-3974

Interest

  • Academic (general)
