Statistical Mechanics of Optimal Convex Inference in High Dimensions

Type: Article

Publication Date: 2016-08-29

Citations: 48

DOI: https://doi.org/10.1103/physrevx.6.031034

Abstract

To model modern large-scale datasets, we need efficient algorithms to infer a set of $P$ unknown model parameters from $N$ noisy measurements. What are fundamental limits on the accuracy of parameter inference, given finite signal-to-noise ratios, limited measurements, prior information, and computational tractability requirements? How can we combine prior information with measurements to achieve these limits? Classical statistics gives incisive answers to these questions as the measurement density $\alpha = \frac{N}{P}\rightarrow \infty$. However, these classical results are not relevant to modern high-dimensional inference problems, which instead occur at finite $\alpha$. We formulate and analyze high-dimensional inference as a problem in the statistical physics of quenched disorder. Our analysis uncovers fundamental limits on the accuracy of inference in high dimensions, and reveals that widely cherished inference algorithms like maximum likelihood (ML) and maximum a posteriori (MAP) inference cannot achieve these limits. We further find optimal, computationally tractable algorithms that can achieve these limits. Intriguingly, in high dimensions, these optimal algorithms become computationally simpler than MAP and ML, while still outperforming them. For example, such optimal algorithms can lead to as much as a 20% reduction in the amount of data needed to achieve the same performance relative to MAP. Moreover, our analysis reveals simple relations between optimal high-dimensional inference and low-dimensional scalar Bayesian inference, insights into the nature of generalization and predictive power in high dimensions, information-theoretic limits on compressed sensing, phase transitions in quadratic inference, and connections to central mathematical objects in convex optimization theory and random matrix theory.

Locations

  • Physical Review X
  • arXiv (Cornell University)
  • DOAJ (Directory of Open Access Journals)
  • DataCite API

Similar Works

  • Bayes-optimal limits in structured PCA, and how to reach them (2022). Jean Barbier, Francesco Camilli, Marco Mondelli, Manuel Sáenz
  • High-dimensional statistics, sparsity and inference (2013). Sara van de Geer
  • Sparse high-dimensional linear regression. Estimating squared error and a phase transition (2022). David Gamarnik, Ilias Zadik
  • Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting (2007). Martin J. Wainwright
  • High-Dimensional Probability (2018). Roman Vershynin
  • Fundamental limits of high-dimensional estimation: a stroll between statistical physics, probability and random matrix theory (2021). Antoine Maillard
  • Fundamental limits in structured principal component analysis and how to reach them (2023). Jean Barbier, Francesco Camilli, Marco Mondelli, Manuel Sáenz
  • High-Dimensional Probability: An Introduction with Applications in Data Science (2018). Roman Vershynin
  • High-dimensional Variable Selection with Sparse Random Projections: Measurement Sparsity and Statistical Efficiency (2010). Dapo Omidiran, Martin J. Wainwright
  • Automated Scalable Bayesian Inference via Hilbert Coresets (2019). Trevor Campbell, Tamara Broderick
  • Automated Scalable Bayesian Inference via Hilbert Coresets (2017). Trevor Campbell, Tamara Broderick
  • High-Dimensional Statistics (2019). Martin J. Wainwright
  • High-Dimensional Statistics: A Non-Asymptotic Viewpoint (2019). Martin J. Wainwright
  • Computation-information gap in high-dimensional clustering (2024). Bertrand Even, Christophe Giraud, Nicolas Verzélen
  • Understanding Phase Transitions via Mutual Information and MMSE (2021). Galen Reeves, Henry D. Pfister
  • Observed universality of phase transitions in high-dimensional geometry, with implications for modern data analysis and signal processing (2009). David L. Donoho, Jared Tanner
  • Information-Theoretic Limits for the Matrix Tensor Product (2020). Galen Reeves
  • Computational and Statistical Aspects of High-Dimensional Structured Estimation (2018). Sheng Chen
  • Understanding Phase Transitions via Mutual Information and MMSE (2019). Galen Reeves, Henry D. Pfister