When: Thursday, August 16, 2018, 9:00 AM - 12:00 PM CT
Where: Chambers Hall, Lower Level, 600 Foster St, Evanston, IL 60208
Audience: Faculty/Staff - Post Docs/Docs - Graduate Students
Cost: $10
Contact: Christina Maimone
Group: Northwestern Information Technology
Category: Training
You’ve collected or received your text data and need to clean them for analysis. In this workshop, we’ll go over the types of cleaning you might need to do, given your research question, and how to carry out each one.
Things you’ll learn in this workshop:
Tokenization
(Foreign) language detection
Stemming and lemmatization
Stoplisting (removing some words)
Classifying words by semantic type (e.g. emotional, rational)
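As a rough illustration of three of the topics above (tokenization, stoplisting, and stemming), here is a minimal Python sketch using only the standard library. This listing does not say which language or tools the workshop uses, so everything here is an assumption: the function names, the tiny stoplist, and the suffix-stripping rule are hypothetical and far cruder than a real stemmer such as Porter's.

```python
import re

# Hypothetical, tiny stoplist for illustration; real stoplists are much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "are", "in"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop every token that appears in the stoplist."""
    return [t for t in tokens if t not in stopwords]


def crude_stem(token):
    """Strip one common English suffix, keeping at least three characters.

    This is a toy rule, not a real stemming algorithm.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def clean(text):
    """Tokenize, remove stopwords, then stem what is left."""
    return [crude_stem(t) for t in remove_stopwords(tokenize(text))]


if __name__ == "__main__":
    print(clean("The cats are chasing the mice in gardens"))
```

Run as a script, this prints `['cat', 'chas', 'mice', 'garden']`. The results "chas" and the unchanged "mice" show why a stem is not always a dictionary word and why lemmatization, which maps a word to its dictionary form, is treated as a separate step.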