Northwestern Events Calendar

IEMS Seminar: Large Topic Models: Efficient Inference and Applications

When: Tuesday, April 28, 2015
11:00 AM - 12:00 PM CT

Where: Technological Institute, M228, 2145 Sheridan Road, Evanston, IL 60208

Audience: Faculty/Staff - Student - Public - Post Docs/Docs - Graduate Students

Contact: Agnes Kaminski, (847) 491-3576

Group: Department of Industrial Engineering and Management Sciences (IEMS)

Category: Lectures & Meetings

Description:

Doug Downey

Northwestern University - EECS Department

Abstract: Latent variable topic models such as Latent Dirichlet Allocation (LDA) can discover topics from text in an unsupervised fashion. However, scaling the models up to the many distinct topics exhibited in modern corpora is challenging. "Flat" topic models like LDA have difficulty modeling sparsely expressed topics, and richer hierarchical models become computationally intractable as the number of topics increases. In this talk, I will introduce efficient methods for inferring large topic hierarchies. The approach is built upon the Sparse Backoff Tree (SBT), a new prior for latent topic distributions that organizes the latent topics as leaves in a tree. I will show how a document model based on SBTs can effectively infer accurate topic spaces of over a million topics. Experiments demonstrate that scaling to large topic spaces results in much more accurate models, and that SBT document models make use of large topic spaces more effectively than flat LDA. Lastly, I will describe how the models can be used to power Atlasify, a prototype exploratory search engine.


Biography: Doug Downey is an Associate Professor in EECS at Northwestern University. His research focuses on natural language processing, machine learning, and artificial intelligence.
