When:
Wednesday, October 24, 2018
11:00 AM - 12:00 PM CT
Where: 2006 Sheridan Road, B02, Evanston, IL 60208
Audience: Faculty/Staff - Post Docs/Docs - Graduate Students
Contact:
Kisa Kowal
(847) 491-3974
Group: Department of Statistics and Data Science
Category: Academic
Department of Statistics Fall 2018 Seminar Series
A classification method for predicting type 2 diabetes mellitus using sequencing data
Speaker: Haiyan Wang, Professor, Department of Statistics, Kansas State University
Time: 11:00am
Abstract: Type 2 diabetes mellitus (T2DM) affects the lives of millions of people through its life-altering complications. Current methods of identifying genetic polymorphisms responsible for T2DM are limited by sample size and by low accuracy at the population level (AUC of 0.68 or below). This research presents a method to identify subtle effects of genetic variants using whole genome sequencing data and to improve prediction accuracy of T2DM at the population level. To achieve this, a new feature selection procedure and a classifier were proposed. The method involves (1) applying sparse principal component analysis (PCA) to genotype data to obtain orthogonal features; (2) using SNP-specific regularization parameters to reduce the false positive rate of feature selection; (3) verifying feature relevance through Lasso penalized logistic regression in conjunction with sparse PCA. When applied to a dataset containing 625,597 SNPs and 23 environmental variables from each of 3,326 individuals, the method identified over 450 genetic variants that each have subtle effects on T2DM prediction. These variants, in conjunction with clinical characteristics, led to greatly improved prediction accuracy (AUC 0.79) for new patients at the population level. The proposed method is also computationally efficient, running 20 times faster than a Random Forest classifier, and thus provides a promising tool for large-scale genome-wide association studies.
Joint work with Luann C Jung at the Massachusetts Institute of Technology, and Xukun Li and Cen Wu at Kansas State University.
Location: Basement classroom - B02, Department of Statistics, 2006 Sheridan Road
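
Illustrative sketch: for readers unfamiliar with the building blocks named in the abstract, the following toy example shows Lasso penalized logistic regression with feature-specific penalties, fitted by proximal gradient descent, together with an AUC calculation. It is not the speaker's implementation. The data are synthetic Gaussian features standing in for orthogonal components, the sparse PCA step is omitted, and the names fit_l1_logistic, penalties, and auc, as well as the hand-picked penalty values, are assumptions made for illustration. The abstract does not say how the SNP-specific regularization parameters are chosen.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, penalties, step=0.5, n_iter=300):
    """Proximal gradient descent for L1-penalized logistic regression.

    penalties[j] is a hypothetical feature-specific penalty for column j.
    The intercept is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then shrink each weight
        # by its own penalty.
        w = [soft_threshold(w[j] - step * grad_w[j], step * penalties[j])
             for j in range(p)]
    return w, b


def auc(scores, labels):
    """AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data (an assumption, not the study's data): only the first
# two of six features influence the outcome.
random.seed(0)
n_samples, n_features = 200, 6
true_w = [2.0, -1.5, 0.0, 0.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(n_features)]
     for _ in range(n_samples)]
y = [1 if random.random() < sigmoid(sum(t * x for t, x in zip(true_w, row)))
     else 0
     for row in X]

# Hand-picked illustrative penalties, one per feature.
penalties = [0.05, 0.05, 0.3, 0.3, 0.3, 0.3]
weights, intercept = fit_l1_logistic(X, y, penalties)

scores = [intercept + sum(wj * xj for wj, xj in zip(weights, row)) for row in X]
train_auc = auc(scores, y)

print("weights:", [round(wj, 3) for wj in weights])
print("training AUC:", round(train_auc, 3))
```

The per-feature penalty is the only departure from the ordinary Lasso: a larger value for a given feature makes it harder for that feature to enter the model, which is one way a feature-specific parameter can limit false positives. The AUC here is computed on the training data, so it overstates how the toy model would perform on new samples.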