The composite absolute penalties family for grouped and hierarchical variable selection

Type: Article

Publication Date: 2009-08-17

Citations: 578

DOI: https://doi.org/10.1214/07-aos584

Abstract

Extracting useful information from high-dimensional data is an important focus of today’s statistical research and practice. Penalized loss function minimization has been shown to be effective for this task both theoretically and empirically. With the virtues of both regularization and sparsity, the L1-penalized squared error minimization method Lasso has been popular in regression models and beyond. In this paper, we combine different norms including L1 to form an intelligent penalty in order to add side information to the fitting of a regression or classification model to obtain reasonable estimates. Specifically, we introduce the Composite Absolute Penalties (CAP) family, which allows given grouping and hierarchical relationships between the predictors to be expressed. CAP penalties are built by defining groups and combining the properties of norm penalties at the across-group and within-group levels. Grouped selection occurs for nonoverlapping groups. Hierarchical variable selection is reached by defining groups with particular overlapping patterns. We propose using the BLASSO and cross-validation to compute CAP estimates in general. For a subfamily of CAP estimates involving only the L1 and L∞ norms, we introduce the iCAP algorithm to trace the entire regularization path for the grouped selection problem. Within this subfamily, unbiased estimates of the degrees of freedom (df) are derived so that the regularization parameter is selected without cross-validation. CAP is shown to improve on the predictive performance of the LASSO in a series of simulated experiments, including cases with p≫n and possibly mis-specified groupings. When the complexity of a model is properly calculated, iCAP is seen to be parsimonious in the experiments.
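
As an illustration of how a penalty can combine norms at the within-group and across-group levels, here is a minimal Python sketch. It assumes the penalty has the form sum over groups k of (the L_gamma_k norm of the coefficients in group k) raised to a power gamma_0. This form is an assumption made for illustration and should be checked against the paper's own definition. The function name `cap_penalty` and its parameters are hypothetical and do not come from the authors' software.

```python
import numpy as np

def cap_penalty(beta, groups, group_norms, gamma0=1.0):
    """Illustrative composite penalty: sum_k ||beta[G_k]||_{gamma_k} ** gamma0.

    beta        : 1-D array of coefficients
    groups      : list of index lists (groups may overlap)
    group_norms : one within-group norm order per group (e.g. 2 or np.inf)
    gamma0      : across-group exponent (1.0 gives an L1-type sum over groups)
    """
    beta = np.asarray(beta, dtype=float)
    total = 0.0
    for idx, gamma_k in zip(groups, group_norms):
        # Within-group norm; np.linalg.norm accepts ord=np.inf for max |.|
        n_k = np.linalg.norm(beta[idx], ord=gamma_k)
        # Across-group combination
        total += n_k ** gamma0
    return total

beta = np.array([3.0, -4.0, 0.0, 1.0])

# Nonoverlapping groups: L1 across groups, L2 within groups
grouped_l2 = cap_penalty(beta, [[0, 1], [2, 3]], [2, 2])

# Nonoverlapping groups: L1 across groups, L-infinity within groups
grouped_linf = cap_penalty(beta, [[0, 1], [2, 3]], [np.inf, np.inf])

# Overlapping groups (assumed pattern): coefficient 1 sits in both groups,
# so it is penalized more heavily than coefficient 0
hierarchical = cap_penalty(beta, [[0, 1], [1]], [np.inf, np.inf])
```

With nonoverlapping groups, a whole group's contribution vanishes only when all of its coefficients are zero, which is what drives grouped selection. The overlapping pattern in the last call is one assumed way to encode an ordering between two predictors.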

Locations

  • The Annals of Statistics
  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Grouped and Hierarchical Model Selection through Composite Absolute Penalties (2007). Peng Zhao, Guilherme V. Rocha, Bin Yu
  • High-Dimensional LASSO-Based Computational Regression Models: Regularization, Shrinkage, and Selection (2019). Frank Emmert‐Streib, Matthias Dehmer
  • Regularized methods for high-dimensional and bi-level variable selection (2009). Patrick Breheny
  • Regularization: sparsity, structure and computation (2006). Bin Yu, Peng Zhao
  • SGPR: Sparse Group Penalized Regression for Bi-Level Variable Selection (2024). Gregor Buch
  • Variable selection in linear models (2013). Yuqi Chen, Pang Du, Yuedong Wang
  • MLGL: An R package implementing correlated variable selection by hierarchical clustering and group-Lasso (2022). Quentin Grimonprez, Samuel Blanck, Alain Célisse, Guillemette Marot
  • Group variable selection via SCAD-L2 (2012). Lingmin Zeng, Jun Xie
  • Balancing Statistical and Computational Precision and Applications to Penalized Linear Regression with Group Sparsity (2016). Mahsa Taheri, Néhémy Lim, Johannes Lederer
  • Feature Grouping Using Weighted l1 Norm for High-Dimensional Data (2016). Bhanukiran Vinzamuri, Karthik K. Padthe, Chandan K. Reddy
  • Sparse Group Penalties for bi‐level variable selection (2024). Gregor Buch, Andreas Schulz, Irene Schmidtmann, Konstantin Strauch, Philipp S. Wild
  • Variable selection using Lq penalties (2014). Xingye Qiao
  • A small review and further studies on the LASSO (2013). Sunghoon Kwon, Sang-Mi Han, Sang-In Lee
  • SPARSE REGULARIZATION FOR BI-LEVEL VARIABLE SELECTION (2015). Hidetoshi Matsui
  • Spline-Lasso in High-Dimensional Linear Regression (2015). Jianhua Guo, Jianchang Hu, Bing‐Yi Jing, Zhen Zhang
  • Nonconvex selection in nonparametric additive models (2014). Xiangmin Zhang
  • Penalized regression for discrete structures (2015). Margret‐Ruth Oelker
  • Discussion of ‘Correlated variables in regression: Clustering and sparse estimation’ by Peter Bühlmann, Philipp Rütimann, Sara van de Geer and Cun-Hui Zhang (2013). Rajen D. Shah, Richard J. Samworth
  • Efficient Clustering of Correlated Variables and Variable Selection in High-Dimensional Linear Models (2016). Niharika Gauraha, Swapan K. Parui
  • Sélection de groupes de variables corrélées par classification ascendante hiérarchique et group-lasso (2015). Quentin Grimonprez, Alain Célisse, Guillemette Marot