Northwestern Events Calendar

May 30, 2018

Biostatistics Seminar

When: Wednesday, May 30, 2018
11:00 AM - 12:00 PM CT

Where: 680 N. Lake Shore Drive, Suite 1400 - Stamler Conference Room, Chicago, IL 60611

Audience: Public

Contact: Catherine McDonnell, (312) 908-7914

Group: Department of Preventive Medicine

Category: Academic

Description:

Bangxin Zhao, PhD
Department of Statistics and Actuarial Science
University of Western Ontario


Correlation Learning in High Dimensional Data
In this talk, we discuss new methodologies for high-dimensional variable screening, influence measures, and post-selection inference, unified by the theme of correlation learning. We propose a new estimator of the correlation between the response and high-dimensional predictor variables and, based on this estimator, develop a new screening technique termed Dynamic Tilted Current Correlation Screening (DTCCS). DTCCS is capable of picking up the relevant predictor variables within a finite number of steps. The DTCCS method includes the widely used sure independence screening (SIS) method and the high-dimensional ordinary least squares projection (HOLP) approach as special cases.
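For readers unfamiliar with the screening baseline named above, the following is a minimal sketch of plain SIS, which ranks predictors by the absolute value of their marginal correlation with the response and keeps the top d. It is not the DTCCS procedure from the talk; the function names, the toy data, and the choice of d are illustrative assumptions.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no marginal signal
    return sxy / math.sqrt(sxx * syy)

def sis_screen(columns, y, d):
    """Indices of the d predictors with the largest |marginal correlation| (hypothetical helper)."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:d])

# Toy data (assumed for illustration): y tracks column 0, and column 2
# is strongly negatively related to column 0.
columns = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
    [6.0, 5.0, 4.0, 3.0, 2.0, 2.0],
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],
]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

selected = sis_screen(columns, y, d=2)
print(selected)
```

Because the ranking uses only marginal correlations, a predictor that matters only jointly with others can be missed, which is one motivation for refinements such as HOLP.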

Two methods for measuring high-dimensional influence have also been explored, one from the perspective of the extreme value distribution (EVD) and the other from the robustness of design. For the first method, EVD-type statistics are shown, both theoretically and numerically, to be powerful in measuring high-dimensional influence. For the second method, we propose the Hellinger distance for high-dimensional influence measure (HD-HIM). The inner product of two transformed influence functions is used to measure the Hellinger distance between the two discrete distribution functions obtained from the whole dataset and the deleted dataset. This construction provides the detection power needed to flag influential observations.
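As background for the second method, the sketch below computes the standard Hellinger distance between two discrete distributions through the inner product of their square-root-transformed probability vectors. It illustrates the generic distance only, not the HD-HIM statistic itself; the function name and the two example distributions are assumptions.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions on a common support."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    # Inner product of the square-root-transformed probability vectors
    # (also known as the Bhattacharyya coefficient).
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - bc))

# Hypothetical distributions standing in for the "whole" and "deleted" datasets.
p_whole = [0.25, 0.25, 0.25, 0.25]
p_deleted = [0.40, 0.20, 0.20, 0.20]

print(hellinger(p_whole, p_deleted))
```

The distance is 0 for identical distributions and 1 for distributions with disjoint support, so larger values indicate that deleting an observation changed the distribution more.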

Lastly, we propose a new, numerically feasible post-selection inference method, termed Cosine PoSI, in the high-dimensional framework. Cosine PoSI focuses on the geometric aspect of Least Angle Regression (LARS). LARS efficiently provides a solution path along which the entered predictors always have the same absolute correlation with the current residual. At each step of the LARS algorithm, the proposed Cosine PoSI method derives an angle from the correlation between the entering variable and the current residual and treats this angle as a random variable from the cosine distribution. Post-selection inference is then conducted based on the order statistics of this cosine distribution. Given the collection of possible angles, hypothesis tests are performed using the limiting distribution of the maximum angle. We conduct simulation studies and a real-data analysis to illustrate the effectiveness of the proposed method.
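The geometric link between correlation and angle can be sketched for the first LARS step, where the residual is the centered response and the entering predictor is the one with the largest absolute correlation. This is not the Cosine PoSI test; defining the angle as the arccosine of the absolute correlation is an assumption made here for illustration, as are the helper names and toy data.

```python
import math

def center(v):
    """Subtract the mean so that cosine similarity equals Pearson correlation."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return dot / (nu * nv)

def first_lars_angle(columns, y):
    """Index of the first entering predictor and its angle (radians) with the residual."""
    residual = center(y)  # with no predictors in the model yet
    cosines = [cosine(center(col), residual) for col in columns]
    j = max(range(len(columns)), key=lambda k: abs(cosines[k]))
    # Clamp to 1.0 so rounding error cannot push acos out of its domain.
    return j, math.acos(min(1.0, abs(cosines[j])))

# Toy data (assumed): y rises with column 0 and is weakly related to column 1.
columns = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]]
y = [1.0, 2.0, 3.0, 4.5]

index, angle = first_lars_angle(columns, y)
print(index, angle)
```

A small angle means the entering predictor is nearly collinear with the current residual, while an angle near pi/2 means it is nearly orthogonal to it.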
