Beyond Parallel Pancakes: Quasi-Polynomial Time Guarantees for Non-Spherical Gaussian Mixtures

Type: Preprint

Publication Date: 2021-12-10

Citations: 0

Abstract

We consider mixtures of $k\geq 2$ Gaussian components with unknown means and unknown covariance (identical for all components) that are well-separated, i.e., distinct components have statistical overlap at most $k^{-C}$ for a large enough constant $C\ge 1$. Previous statistical-query lower bounds [DKS17] give formal evidence that even distinguishing such mixtures from (pure) Gaussians may be exponentially hard (in $k$). We show that this kind of hardness can only appear if mixing weights are allowed to be exponentially small, and that for polynomially lower-bounded mixing weights non-trivial algorithmic guarantees are possible in quasi-polynomial time. Concretely, we develop an algorithm based on the sum-of-squares method with running time quasi-polynomial in the minimum mixing weight. The algorithm can reliably distinguish between a mixture of $k\ge 2$ well-separated Gaussian components and a (pure) Gaussian distribution. As a certificate, the algorithm computes a bipartition of the input sample that separates a pair of mixture components, i.e., both sides of the bipartition contain most of the sample points of at least one component. For the special case of collinear means, our algorithm outputs a $k$-clustering of the input sample that is approximately consistent with the components of the mixture. A significant challenge for our results is that, unlike most previous results for Gaussian mixtures, they appear to be inherently sensitive to small fractions of adversarial outliers. The reason is that such outliers can simulate exponentially small mixing weights even for mixtures with polynomially lower-bounded mixing weights. A key technical ingredient is a characterization of separating directions for well-separated Gaussian components in terms of ratios of polynomials that correspond to moments of two carefully chosen orders, both logarithmic in the minimum mixing weight.
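
The following is a minimal, hedged Python sketch (not the paper's sum-of-squares algorithm, which works in high dimensions with ratios of moment polynomials) meant only to illustrate why moments of order roughly logarithmic in the minimum mixing weight can separate a well-separated mixture from a pure Gaussian. The parameters w_min and separation and the helpers gaussian_moment and standardized_moment are illustrative assumptions, not quantities from the paper.

# Toy 1-D illustration: after standardizing to zero mean and unit variance,
# a two-component mixture with common covariance and well-separated means
# deviates from a pure Gaussian already in a low-order moment of order
# ~ log(1 / w_min).  This is only a numerical sanity check, not the
# algorithm described in the abstract.

import numpy as np

rng = np.random.default_rng(0)


def gaussian_moment(t: int) -> float:
    """E[g^t] for g ~ N(0, 1): 0 for odd t, (t - 1)!! for even t."""
    if t % 2 == 1:
        return 0.0
    return float(np.prod(np.arange(t - 1, 0, -2)))


def standardized_moment(x: np.ndarray, t: int) -> float:
    """Empirical t-th moment after centering and rescaling to unit variance."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** t))


n, w_min, separation = 200_000, 0.1, 6.0
t = 2 * max(2, int(np.ceil(np.log(1.0 / w_min))))  # even order ~ log(1 / w_min)

# Pure Gaussian vs. a two-component mixture with weights (w_min, 1 - w_min),
# common unit covariance, and means `separation` apart.
pure = rng.normal(0.0, 1.0, size=n)
labels = rng.random(n) < w_min
mixture = np.where(labels, rng.normal(separation, 1.0, n), rng.normal(0.0, 1.0, n))

for name, sample in [("pure Gaussian", pure), ("mixture", mixture)]:
    ratio = standardized_moment(sample, t) / gaussian_moment(t)
    print(f"{name:>13}: E[z^{t}] / E[g^{t}] is about {ratio:.2f}")

# The ratio stays near 1 for the pure Gaussian and drifts well away from 1
# for the well-separated mixture, hinting at why moments of logarithmic
# order carry the distinguishing information the abstract refers to.

With the assumed parameters above, the pure-Gaussian ratio concentrates near 1 while the mixture's ratio is noticeably larger; the choice of an even order avoids the vanishing odd Gaussian moments.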

Locations

  • arXiv (Cornell University)

Similar Works

  • Beyond Parallel Pancakes: Quasi-Polynomial Time Guarantees for Non-Spherical Gaussian Mixtures (2021): Rares-Darius Buhai, David Steurer
  • Robustly Learning any Clusterable Mixture of Gaussians (2020): Ilias Diakonikolas, Samuel B. Hopkins, Daniel M. Kane, Sushrut Karmalkar
  • Settling the Polynomial Learnability of Mixtures of Gaussians (2010): Ankur Moitra, Gregory Valiant
  • Sum-of-squares lower bounds for Non-Gaussian Component Analysis (2024): Ilias Diakonikolas, Sushrut Karmalkar, Shuo Pang, Aaron Potechin
  • Mixture Models, Robustness, and Sum of Squares Proofs (2017): Samuel B. Hopkins, Jerry Li
  • Clustering Mixtures with Almost Optimal Separation in Polynomial Time (2024): Jerry Li, Allen Liu
  • Clustering Mixtures with Almost Optimal Separation in Polynomial Time (2021): Jerry Li, Allen Liu
  • Clustering mixtures with almost optimal separation in polynomial time (2022): Allen Liu, Jerry Li
  • On Learning Mixtures of Well-Separated Gaussians (2017): Oded Regev, Aravindan Vijayaraghavan
  • Clustering mixture models in almost-linear time via list-decodable mean estimation (2022): Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
  • Efficient Certificates of Anti-Concentration Beyond Gaussians (2024): Ainesh Bakshi, Pravesh K. Kothari, Goutham Rajendran, Madhur Tulsiani, Aravindan Vijayaraghavan
  • Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation (2023): Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas
  • SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions (2024): Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren, Yuxin Sun
  • SoS Certifiability of Subgaussian Distributions and its Algorithmic Applications (2024): Ilias Diakonikolas, Samuel B. Hopkins, Ankit Pensia, Stefan Tiegel
  • Robust Learning of Mixtures of Gaussians (2020): Daniel M. Kane

Works That Cite This (0)

Works Cited by This (0)