Northwestern Events Calendar

Jun
28
2024

Python Scikit-Learn (In-Person)

When: Friday, June 28, 2024
9:30 AM - 3:30 PM CT

Where: Mudd Hall (formerly Seeley G. Mudd Library), North Study Lounge, 2233 Tech Drive, Evanston, IL 60208

Audience: Faculty/Staff - Student - Post Docs/Docs - Graduate Students

Cost: $5.00

Contact: Leticia Vega  

Group: Northwestern IT Research Computing and Data Services

Category: Training

Description:

Scikit-Learn is one of the major libraries for machine learning in Python. This workshop includes four sessions designed to give you a map of Scikit-Learn’s functionality and put you on firm ground to start using it for your own machine learning projects. This in-person workshop will not be recorded.
 
Session 1: The first session introduces the fundamentals of machine learning (ML): How does ML differ from statistics? Which types of problems are best suited for ML? We will explore different types of real-world problems and learn to distinguish between supervised and unsupervised learning. You will become familiar with the main stages of the data science pipeline: data wrangling, cleaning and pre-processing, modeling, optimization and model validation, and post-processing and visualization.  

Session 2: The second session covers regression analysis, which is a powerful tool for uncovering the associations between features of your data (known as independent variables) and dependent variables (usually denoted by Y). In this workshop you will learn to identify ML tasks that are suited for regression analysis, process independent variables appropriately, train and evaluate models, and generate predictions. We will also discuss some common pitfalls and assumptions of the chosen modeling techniques.  

Session 3: The third session is about classification, which is the problem of identifying which class or category (label) an observation (features) belongs to from within a pre-defined set of categories. In this workshop, you will learn to identify classification problems, prepare the features and label data for modeling, train and evaluate models, and generate predictions. We will also discuss some common pitfalls and assumptions of the chosen modeling techniques.  

Session 4: The last session focuses on unsupervised learning, which uses machine learning to analyze unlabeled datasets without human supervision. Many real-world problems require discovering hidden patterns in data. In this session, you will learn about different unsupervised learning methods, such as dimensionality reduction and clustering, and how to process your data to apply these algorithms. We will also discuss other machine learning methods and future steps.
 
Prerequisites: Basic familiarity with Python is required. Familiarity with NumPy is highly recommended. No previous machine learning or statistics experience is necessary, though it will be useful. You will need to bring a laptop to participate.

Registration Required. 
