This lecture introduces data anonymization as a formal approach to protecting individual privacy when releasing datasets. It motivates the need for rigorous guarantees with well-known failures of naive de-identification, in which linkage attacks use quasi-identifiers to re-identify individuals. The lecture covers core Statistical Disclosure Control (SDC) concepts and a range of anonymization techniques, including generalization, suppression, pseudonymization, and data perturbation. It also presents practical methodology for applying anonymization in context-specific settings, with attention to the inherent privacy-utility tradeoff that practitioners must navigate.
Audience
- Faculty/Staff
- Student
- Graduate Students
Contact
Master of Science in Machine Learning and Data Science Program
Email
Interest
- Academic (general)