I am an associate professor in statistics and (adjunct) in economics at the University of Washington, and an affiliated investigator at the Fred Hutchinson Cancer Research Center. I obtained my Ph.D. (Biostatistics) from Johns Hopkins University in 2015. The last two years of my graduate study were supported by a Google Ph.D. Fellowship. Previously, I received my B.S. (Mathematics) from Peking University and my M.S. (Biostatistics) from the University of Minnesota.
I am a current associate editor for Bernoulli (01/2022-present) and an editorial advisory board member of Dependence Modeling (01/2025-present). My research is supported by NSF DMS-1712536, SES-2019363, and DMS-2210019.
I am a mathematical statistician.
(Beginning Spring 2022, we have fully switched to Canvas, and the following websites will no longer be updated.)
An Introduction to Permutation Processes
- Rank- and graph-based methods (measure of dependence) -
On a rank-based Azadkia-Chatterjee correlation coefficient
Limit theorems of Chatterjee's rank correlation
Azadkia-Chatterjee's correlation coefficient adapts to manifold data
On the failure of the bootstrap for Chatterjee's rank correlation
On Azadkia-Chatterjee's conditional dependence coefficient
On boosting the power of Chatterjee's rank correlation (program code)
On the power of Chatterjee's rank correlation
On extensions of rank correlation coefficients to multivariate spaces
High dimensional consistent independence testing with maxima of rank correlations
Distribution-free tests of independence in high dimensions

- Rank- and graph-based methods (causal inference) -
On the limiting variance of matching estimators
On the consistency of bootstrap for matching estimators
On the adaptation of causal forests to manifold data
On regression-adjusted imputation estimators of the average treatment effect
On Rosenbaum's rank-based matching estimator
On propensity score matching with a diverging number of matches
Estimation based on nearest neighbor matching: from density ratio to average treatment effect

- Rank- and graph-based methods (regression, PCA, graphical models, etc.) -
Robust functional principal component analysis via a functional pairwise spatial sign operator
On rank estimators in increasing dimensions
ECA: High dimensional elliptical component analysis in non-Gaussian distributions (program code)
On inference validity of weighted U-statistics under data heterogeneity
Robust inference of risks of large portfolios
Sparse median graphs estimation in a high dimensional semiparametric model
High dimensional semiparametric scale-invariant principal component analysis
Scale-invariant sparse PCA on high dimensional meta-elliptical data
CODA: high dimensional copula discriminant analysis
High dimensional semiparametric Gaussian copula graphical models

- Statistical optimal transport (mixture models) -
A sliced Wasserstein and diffusion approach to random coefficient models
Smoothed NPMLEs in nonparametric Poisson mixtures and beyond
Fisher-Pitman permutation tests based on nonparametric Poisson mixtures with application to single cell genomics
Nonparametric mixture MLEs under Gaussian-smoothed optimal transport distance

- Statistical optimal transport (multivariate ranks) -
Distribution-free tests of multivariate independence based on center-outward quadrant, Spearman, Kendall, and van der Waerden statistics
On universally consistent and fully distribution-free rank tests of vector independence
Distribution-free consistent independence tests via center-outward ranks and signs (program code)

- Nonparametric and semiparametric regressions -
Adaptive estimation of high dimensional partially linear model (program) (supplement)
On a phase transition in general order spline regression
Optimal estimation of variance in nonparametric regression with random design
On estimation of isotonic piecewise constant signals
A provable smoothing approach for high dimensional generalized regression with applications in genomics

- Time series analysis -
Tail behavior of dependent V-statistics and its applications
Estimation and inference on Granger causality in a latent high-dimensional Gaussian vector autoregressive model
Probability inequalities for high dimensional time series under a triangular array framework
Moment bounds for large autocovariance matrices under dependence
Exponential inequalities for dependent V-statistics via random Fourier features
An exponential inequality for U-statistics under mixing conditions
Joint estimation of multiple graphical models from high dimensional dependent data
A direct estimation of high dimensional stationary vector autoregressions

- Random matrix theory -
Robust scatter matrix estimation for high dimensional distributions with heavy tail
Asymptotic joint distribution of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model
An extreme-value approach for testing the equality of large U-statistic based correlation matrices
On Gaussian comparison inequality and its application to spectral analysis of large random matrices
Statistical analysis of latent generalized correlation matrix estimation in transelliptical distribution

- Others -
Challenges of big data analysis
Searching for differentially expressed genes by PLS-VIP method

- Applications -
The Baltimore declaration toward the exploration of organoid intelligence
First organoid intelligence (OI) workshop to form an OI community
Shell microelectrode arrays (MEAs) for brain organoids
Individual level differential expression analysis for single cell RNA-seq data
Genome-wide profiling of multiple histone methylations in olfactory cells: further implications for cellular susceptibility to oxidative stress in schizophrenia
Automated diagnoses of attention deficit hyperactive disorder using MRI
A composite likelihood approach to latent multivariate Gaussian modeling of SNP data with application to genetic association testing
Powerful multi-marker association tests: unifying genomic distance-based regression and logistic regression
A data-adaptive sum test for disease association with multiple common or rare variants
Test selection with application to detecting disease association with multiple SNPs
Robust portfolio optimization
Robust estimation of transition matrices in high dimensional heavy-tailed vector autoregressive processes
Context aware group nearest shrunken centroids in large-scale genomic studies
Robust sparse principal component regression under the high dimensional elliptical model
Transition matrix estimation in high dimensional vector autoregressive models
Sparse principal component analysis for high dimensional multivariate time series
Principal component analysis on non-Gaussian dependent data
Transelliptical component analysis
Semiparametric principal component analysis
Transelliptical graphical models
The nonparanormal SKEPTIC
Kolmogorov dependence theory
Transelliptical graphical modeling under a hierarchical latent variable framework