Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization

Type: Preprint

Publication Date: 2017-06-01

Citations: 16

DOI: https://doi.org/10.1109/isit.2017.8006706

Download PDF

Abstract

We study the problem of detecting a structured, low-rank signal matrix corrupted with additive Gaussian noise. This includes clustering in a Gaussian mixture model, sparse PCA, and submatrix localization. Each of these problems is conjectured to exhibit a sharp information-theoretic threshold, below which the signal is too weak for any algorithm to detect. We derive upper and lower bounds on these thresholds by applying the first and second moment methods to the likelihood ratio between these "planted models" and null models where the signal matrix is zero. For sparse PCA and submatrix localization, we determine this threshold exactly in the limit where the number of blocks is large or the signal matrix is very sparse; for the clustering problem, our bounds differ by a factor √2 when the number of clusters is large. Moreover, our upper bounds show that for each of these problems there is a significant regime where reliable detection is information-theoretically possible but where known algorithms such as PCA fail completely, since the spectrum of the observed matrix is uninformative. This regime is analogous to the conjectured `hard but detectable' regime for community detection in sparse graphs.

Locations

  • eScholarship (California Digital Library) - View - PDF
  • arXiv (Cornell University) - View - PDF
  • 2022 IEEE International Symposium on Information Theory (ISIT) - View

Similar Works

Action Title Year Authors
+ Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization 2018 Jess Banks
Cristopher Moore
Roman Vershynin
Nicolas Verzélen
Jiaming Xu
+ Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization 2016 Jess Banks
Cristopher Moore
Nicolas Verzélen
Roman Vershynin
Jiaming Xu
+ PDF Chat Fundamental limits of symmetric low-rank matrix estimation 2018 Marc Lelarge
Léo Miolane
+ PDF Chat Statistical and computational limits for sparse matrix detection 2020 Tommaso Cai
Yihong Wu
+ Statistical and Computational Limits for Sparse Matrix Detection 2018 Tommaso Cai
Yihong Wu
+ Phase transitions in spiked matrix estimation: information-theoretic analysis 2018 Léo Miolane
+ PDF Chat Phase transitions in spiked matrix estimation: information-theoretic analysis 2018 Léo Miolane
+ Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices 2014 Yudong Chen
Jiaming Xu
+ Statistical-Computational Tradeoffs in Planted Problems and Submatrix Localization with a Growing Number of Clusters and Submatrices 2014 Yudong Chen
Jiaming Xu
+ Statistical and computational thresholds for the planted $k$-densest sub-hypergraph problem 2020 Luca Corinzia
P. La Penna
Wojciech Szpankowski
Joachim M. Buhmann
+ On the limitation of spectral methods: From the Gaussian hidden clique problem to rank one perturbations of Gaussian tensors 2014 Andrea Montanari
Daniel Reichman
Ofer Zeitouni
+ On the limitation of spectral methods: From the Gaussian hidden clique problem to rank one perturbations of Gaussian tensors 2014 Andrea Montanari
Daniel Reichman
Ofer Zeitouni
+ On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank One Perturbations of Gaussian Tensors 2016 Andrea Montanari
Daniel Reichman
Ofer Zeitouni
+ Community detection and stochastic block models: recent developments 2017 Emmanuel Abbé
+ Statistical Problems with Planted Structures: Information-Theoretical and Computational Limits 2018 Yihong Wu
Jiaming Xu
+ Statistical and Computational Limits of Detecting and Recovering Hidden Submatrices 2024 Marom Dadon
Wasim Huleihel
Tamir Bendory
+ On the limitation of spectral methods: from the Gaussian hidden clique problem to rank-one perturbations of Gaussian tensors 2015 Andrea Montanari
Daniel Reichman
Ofer Zeitouni
+ Community Detection and Stochastic Block Models 2017 Emmanuel Abbé
+ PDF Chat Detection Thresholds in Very Sparse Matrix Completion 2022 Charles Bordenave
Simon Coste
Raj Rao Nadakuditi
+ Curse of Heterogeneity: Computational Barriers in Sparse Mixture Models and Phase Retrieval 2018 Jianqing Fan
Han Liu
Zhaoran Wang
Zhuoran Yang