Estimating high-dimensional covariance and precision matrices under general missing dependence

Type: Article

Publication Date: 2021-01-01

Citations: 11

DOI: https://doi.org/10.1214/21-ejs1892

Abstract

A sample covariance matrix S of completely observed data is the key statistic in a large variety of multivariate statistical procedures, such as structured covariance/precision matrix estimation, principal component analysis, and testing of equality of mean vectors. However, when the data are partially observed, the sample covariance matrix computed from the available data is biased and does not yield valid multivariate procedures. To correct the bias, a simple adjustment method called inverse probability weighting (IPW) has been used in previous research, yielding the IPW estimator. The estimator can play the role of S in the missing-data context, replacing S in off-the-shelf multivariate procedures such as the graphical lasso algorithm. However, theoretical properties (e.g., concentration) of the IPW estimator have been established in earlier work only under very simple missing structures, in which every variable of each sample is independently subject to missingness with equal probability. We investigate the deviation of the IPW estimator when observations are partially observed under general missing dependency. We prove the optimal convergence rate O_p(√(log p / n)) of the IPW estimator in the element-wise maximum norm, even when two quantities frequently assumed to be known in past work (the mean and/or the missing probabilities) are in fact unknown. The optimal rate is especially crucial in estimating a precision matrix, because of the "meta-theorem" [26] stating that the rate of the IPW estimator governs that of the resulting precision matrix estimator. In the simulation study, we discuss a practically important issue, the non-positive semi-definiteness of the IPW estimator, and compare the estimator with imputation methods.
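For intuition, the following is a minimal NumPy sketch of one common form of the IPW (available-case) covariance estimator: observed cross-products are rescaled by the empirical pairwise observation proportions. It is an illustration under simplifying assumptions (missing entries marked as NaN, every pair of variables observed together at least once), not the authors' exact estimator or proofs; the function name `ipw_covariance` is ours.

```python
import numpy as np

def ipw_covariance(X):
    """Illustrative IPW (available-case) covariance estimate.

    X: (n, p) array with np.nan marking missing entries.
    Returns a p x p symmetric matrix; it need not be positive semi-definite.
    """
    delta = ~np.isnan(X)                       # observation indicators delta_ij
    n = X.shape[0]

    # column means estimated from observed entries only (assumes each column
    # has at least one observation)
    mu = np.where(delta, X, 0.0).sum(axis=0) / delta.sum(axis=0)
    Xc = np.where(delta, X - mu, 0.0)          # centered, zero-filled

    # empirical pairwise observation proportions pi_jk (assumed > 0 here)
    pi = delta.astype(float).T @ delta.astype(float) / n

    # available-case cross-products, inverse-probability weighted by 1 / pi_jk
    return (Xc.T @ Xc / n) / pi

# Toy usage: MCAR missingness for illustration only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[rng.random(X.shape) < 0.2] = np.nan
S_ipw = ipw_covariance(X)
print(np.linalg.eigvalsh(S_ipw).min())         # may be negative: IPW is not guaranteed PSD
```

Because the resulting matrix may fail to be positive semi-definite (the issue the abstract raises), in practice it is typically passed to downstream procedures that tolerate this, or first projected onto the PSD cone, before use in methods such as the graphical lasso.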

Locations

  • arXiv (Cornell University)
  • Electronic Journal of Statistics

Similar Works

  • Estimating High-dimensional Covariance and Precision Matrices under General Missing Dependence (2020) - Seongoh Park, Xinlei Wang, Johan Lim
  • High-dimensional Covariance/Precision Matrix Estimation under General Missing Dependency (2020) - 박성오
  • Robust high-dimensional precision matrix estimation (2015) - Viktoria Öllerer, Christophe Croux
  • New estimation methods for high dimensional inverse covariance matrices (2016) - Vahe Avagyan
  • Estimating sparse precision matrices from data with missing values (2012) - Mladen Kolar, Eric P. Xing
  • Robust High-Dimensional Precision Matrix Estimation (2014) - Viktoria Oellerer, Christophe Croux
  • On the Precision Matrix in Semi-High-Dimensional Settings (2020) - Kentaro Hayashi, Ke‐Hai Yuan, Ge Jiang
  • Estimating high-dimensional covariance matrices with misses for Kronecker product expansion models (2016) - Mahdi Zamanighomi, Zhengdao Wang, Georgios B. Giannakis
  • A unified theory of confidence intervals for high-dimensional precision matrix (2021) - Yue Wang, Yang Li, Zemin Zheng
  • Tests of Missing Completely At Random based on sample covariance matrices (2024) - Alberto Bordino, Thomas B. Berrett
  • Precision Matrix Estimation with Noisy and Missing Data (2019) - Roger Fan, Byoungwook Jang, Yuekai Sun, Shuheng Zhou
  • Bayesian Estimation of the Precision Matrix with Monotone Missing Data (2020) - Emna Ghorbel, Kaouthar Kammoun, Mahdi Louati
  • Concentration of a sparse Bayesian model with Horseshoe prior in estimating high-dimensional precision matrix (2024) - The Tien Mai
  • Advanced Computation of Sparse Precision Matrices for Big Data (2017) - Abdelkader Baggag, Halima Bensmail, Jaideep Srivastava
  • High-dimensional covariance matrix estimation (2020) - Clifford Lam
  • Regularized estimation of precision matrix for high-dimensional multivariate longitudinal data (2019) - Fang Qian, Yu Chen, Weiping Zhang
  • Estimation and imputation in Probabilistic Principal Component Analysis with Missing Not At Random data (2019) - Aude Sportisse, Claire Boyer, Julie Josse