Estimation Stability With Cross-Validation (ESCV)

Type: Article

Publication Date: 2015-04-03

Citations: 91

DOI: https://doi.org/10.1080/10618600.2015.1020159

Abstract

Cross-validation (CV) is often used to select the regularization parameter in high-dimensional problems. However, when applied to the sparse modeling method Lasso, CV leads to models that are unstable in high dimensions and consequently not suited for reliable interpretation. In this article, we propose a model-free criterion, ESCV, based on a new estimation stability (ES) metric and CV. ESCV finds a locally ES-optimal model that is smaller than the CV choice, so that it fits the data and also enjoys the estimation stability property. We demonstrate that ESCV is an effective alternative to CV at a similar, easily parallelizable computational cost. In particular, we compare the two approaches with respect to several performance measures when applied to the Lasso on both simulated and real datasets. For dependent predictors, which are common in practice, our main finding is that ESCV cuts down false positive rates, often by a large margin, while sacrificing little of the true positive rates. ESCV usually outperforms CV in terms of parameter estimation while giving performance similar to CV in terms of prediction. For the two real datasets from neuroscience and cell biology, the models found by ESCV are less than half the size of those found by CV, yet they preserve CV's predictive performance and corroborate subject knowledge and independent work. We also discuss some regularization parameter alignment issues that come up in both approaches. Supplementary materials are available online.
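
The abstract does not spell out the ES metric or the selection rule, so the following is only a minimal sketch under stated assumptions. It assumes that ES at each penalty is the variance of the fitted values across the V fold-deleted Lasso fits, normalized by the squared norm of their mean, and that the "locally ES-optimal model smaller than the CV choice" can be approximated by the ES minimizer among penalties at least as large as the CV choice. The function names `es_metric` and `escv_index`, the array layout, and the decreasing penalty ordering are hypothetical choices made for this illustration; fitting the Lasso paths themselves is left to whatever solver is in use.

```python
import numpy as np


def es_metric(fits):
    """Assumed ES metric for each penalty on a shared grid.

    fits has shape (V, L, n): fitted values X @ beta_k(lambda_j) on the full
    design, from V fold-deleted fits, for L penalties and n observations.
    """
    fits = np.asarray(fits, dtype=float)
    mean_fit = fits.mean(axis=0)  # (L, n): average fit across folds
    # Average squared distance of each fold's fit from the mean fit.
    spread = ((fits - mean_fit) ** 2).sum(axis=2).mean(axis=0)  # (L,)
    norm_sq = (mean_fit ** 2).sum(axis=1)  # (L,)
    # The ratio is undefined when the mean fit is zero (empty model);
    # those grid points are marked as infinitely unstable.
    es = np.full(norm_sq.shape, np.inf)
    np.divide(spread, norm_sq, out=es, where=norm_sq > 0)
    return es


def escv_index(es, cv_index):
    """Assumed selection rule: the ES minimizer among penalties >= the CV choice.

    Penalties are assumed sorted from largest to smallest, so grid indices
    0..cv_index correspond to models no larger than the CV model.
    """
    return int(np.argmin(es[: cv_index + 1]))
```

Given fold-wise fits on a common penalty grid and the grid index chosen by ordinary CV, `escv_index(es_metric(fits), cv_index)` returns the grid index of the ESCV choice under these assumptions. Because the V fits are independent of one another, they can be computed in parallel, in line with the abstract's remark on computational cost.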

Locations

  • arXiv (Cornell University)
  • Journal of Computational and Graphical Statistics

Similar Works

  • Estimation Stability with Cross Validation (ESCV). 2013. Chinghway Lim, Bin Yu
  • The Instability of Cross-Validated Lasso. 2013. Kine Veronica Lund
  • Cross-Validated Loss-Based Covariance Matrix Estimator Selection in High Dimensions. 2021. Philippe Boileau, Nima S. Hejazi, Mark J. van der Laan, Sandrine Dudoit
  • nestedcv: an R package for fast implementation of nested cross-validation with embedded feature selection designed for transcriptomics and high-dimensional data. 2023. Myles Lewis, Athina Spiliopoulou, Katriona Goldmann, Costantino Pitzalis, Paul McKeigue, Michael R. Barnes
  • Consistent selection of tuning parameters via variable selection stability. 2013. Wei Sun, Junhui Wang, Yixin Fang
  • Loss-guided Stability Selection. 2022. Tino Werner
  • $\left( \beta, \varpi \right)$-stability for cross-validation and the choice of the number of folds. 2017. Ning Xu, Jian Hong, Timothy S. Fisher
  • On the Selection Stability of Stability Selection and Its Applications. 2024. Mahdi Nouraie, Samuel Müller
  • Rademacher upper bounds for cross-validation errors with an application to the lasso. 2020. Ning Xu, Timothy S. Fisher, Jian Hong
  • Consistent selection of tuning parameters via variable selection stability. 2012. Wei Sun, Junhui Wang, Yixin Fang
  • Loss-guided stability selection. 2023. Tino Werner
  • Cross-Validation With Confidence. 2019. Jing Lei
  • The restricted consistency property of leave-$n_v$-out cross-validation for high-dimensional variable selection. 2013. Yang Feng, Yi Yu
  • Cross-Validated Loss-based Covariance Matrix Estimator Selection in High Dimensions. 2022. Philippe Boileau, Nima S. Hejazi, Mark J. van der Laan, Sandrine Dudoit
  • Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression. 2023. Ryan Cory-Wright, Andrés Gómez
  • Predictive Performance Test based on the Exhaustive Nested Cross-Validation for High-dimensional data. 2024. Iris Ivy Gauran, Hernando Ombao, Zhaoxia Yu
  • Cross-Validation with Confidence. 2017. Jing Lei
  • Cross-validation and regression analysis in high-dimensional sparse linear models. 2011. Feng Zhang
  • Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models. 2010. Han Liu, Kathryn Roeder, Larry Wasserman