Statistics and Data Science Seminar: "Towards Data-efficient Training of Large Language Models (LLMs)" (Zoom)

Friday, February 14, 2025 | 11:00 AM - 12:00 PM CT
Online

Baharan Mirzasoleiman, Assistant Professor, Computer Science Department, UCLA

Abstract: High-quality data is crucial for training LLMs with superior performance. In this talk, I will present two theoretically rigorous approaches to finding smaller subsets of examples that can improve the performance and efficiency of training LLMs. First, I will present a one-shot data selection method for supervised fine-tuning of LLMs. Then, I'll discuss an iterative data selection strategy for pretraining or fine-tuning LLMs on imbalanced mixtures of language data. I'll conclude with empirical results confirming that these data selection strategies can effectively improve the performance of various LLMs during fine-tuning and pretraining.

Cost: free

Audience

  • Faculty/Staff
  • Student
  • Post Docs/Docs
  • Graduate Students

Contact

Kisa Kowal, (847) 491-3974

k-kowal@northwestern.edu

Interest

  • Academic (general)
