Methods of variable selection in regression modeling

Type: Article

Publication Date: 1998-01-01

Citations: 49

DOI: https://doi.org/10.1080/03610919808813505

Abstract

Simulation was used to evaluate the performances of several methods of variable selection in regression modeling: stepwise regression based on partial F-tests, stepwise minimization of Mallows' C_p statistic and Schwarz's Bayes Information Criterion (BIC), and regression trees constructed with two kinds of pruning. Five to 25 covariates were generated in multivariate clusters, and responses were obtained from an ordinary linear regression model involving three of the covariates; each data set had 50 observations. The regression-tree approaches were markedly inferior to the other methods in discriminating between informative and noninformative covariates, and their predictions of responses in "new" data sets were much more variable and less accurate than those of the other methods. The F-test, C_p and BIC approaches were similar in their overall frequencies of "correct" decisions about inclusion or exclusion of covariates, with the C_p method leading to the largest models and the BIC method to the smallest. The three methods were also comparable in their ability to predict "new" observations, with perhaps a tendency for the C_p approach to perform relatively poorly for large covariate pools. The abilities of all methods to discriminate between informative and noninformative covariates and to predict "new" observations decreased with increasing size of the covariate pool.
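
As a rough illustration of the kind of experiment the abstract describes, the following minimal sketch runs forward stepwise selection by BIC and by Mallows' C_p on one simulated data set of 50 observations in which three covariates are informative. It is not the authors' code: the function names, the pool size of 10, the coefficients, the independent standard-normal covariates (the study used multivariate clusters), and the forward-only search are all assumptions made for the example.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def bic(X, y, cols):
    """Gaussian BIC up to an additive constant: n*log(RSS/n) + k*log(n)."""
    n, k = len(y), len(cols) + 1  # k counts the intercept
    return n * np.log(rss(X, y, cols) / n) + k * np.log(n)

def mallows_cp(X, y, cols, sigma2_full):
    """Mallows' C_p: RSS/sigma^2 - n + 2k, with sigma^2 estimated from the full model."""
    n, k = len(y), len(cols) + 1
    return rss(X, y, cols) / sigma2_full - n + 2 * k

def forward_select(X, y, score):
    """Greedy forward search: add the covariate that lowers the score most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = score(X, y, selected)
    while remaining:
        s, j = min((score(X, y, selected + [j]), j) for j in remaining)
        if s >= best:
            break
        best = s
        selected.append(j)
        remaining.remove(j)
    return sorted(selected)

# Hypothetical setup: 50 observations, pool of 10 covariates, columns 0-2 informative.
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 1.0 * X[:, 2] + rng.standard_normal(n)

sigma2_full = rss(X, y, list(range(p))) / (n - p - 1)
by_bic = forward_select(X, y, bic)
by_cp = forward_select(X, y, lambda X, y, c: mallows_cp(X, y, c, sigma2_full))
print("BIC selects:", by_bic)
print("C_p selects:", by_cp)
```

Repeating this over many simulated data sets and over pool sizes from 5 to 25 would give selection frequencies comparable in spirit to those summarized above. Because C_p charges 2 per added term while BIC charges log(50), roughly 3.9, C_p tends to admit more covariates, which is consistent with the abstract's remark about model sizes.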

Locations

  • Communications in Statistics - Simulation and Computation
  • Zenodo (CERN European Organization for Nuclear Research)

Similar Works

  • Variable Selection in Linear Regression (2010). Charles Lindsey, Simon J. Sheather
  • Variable selection methods in regression: Ignorable problem, outing notable solution (2010). Bruce Ratner
  • Variable selection: current practice in epidemiological studies (2009). Stefan Walter, Henning Tiemeier
  • Some aspects of response variable selection and estimation in multivariate linear regression (2021). Jianhua Hu, Xiaoqian Liu, Xu Liu, Ningning Xia
  • FWDselect: An R Package for Variable Selection in Regression Models (2016). Marta Sestelo, Nora M. Villanueva, Luís Meira-Machado, Javier Roca-Pardiñas
  • Model Selection in Regression: Statistical and Scientific Perspectives (2020). Daniel J. Denis
  • Empirical Bayes methods in variable selection (2018). Haim Bar, Kangyan Liu
  • Variable Selection in Linear Regression With Many Predictors (2009). Airong Cai, Ruey S. Tsay, Rong Chen
  • Regression and Variable Selection (2020). Paola Lecca
  • Identifying Informative Predictor Variables With Random Forests (2023). Yannick Rothacher, Carolin Strobl
  • Model Selection (2022). Timothy DelSole, Michael K. Tippett
  • Model Selection in Regression Analysis (2010). Hubertus Brandner, Stefan Lessmann, Simone Garatti, Marco C. Campi, Michael Khachay, Russian Federation, Vadim Strijov, Klara Zetkin, Ekaterina Krymova
  • Variable selection – A review and recommendations for the practicing statistician (2018). Georg Heinze, Christine Wallisch, Daniela Dunkler
  • Variable Selection (2009). Simon J. Sheather
  • Bayesian Variable Selection in Linear Regression (1988). Toby J. Mitchell, John J. Beauchamp
  • Model Selection in Generalized Linear Models (2023). Abdulla Mamun, S. R. Paul
  • Variable Selection (1990). Ashish Sen, Muni S. Srivastava
  • model selection (2008). Jean-Marie Dufour
  • Model Selection (2008). Jean-Marie Dufour
  • Model Selection (2018). Jean-Marie Dufour

Works That Cite This (18)

  • On Determining the Effects of Therapy on Disease Damage in Non Randomized Studies with Multiple Treatments: A Study of Juvenile Myositis (2009). Peter A. Lachenbruch, Frederick W. Miller, Lisa G. Rider
  • A Practical Application of a Simple Bootstrapping Method for Assessing Predictors Selected for Epidemiologic Risk Models Using Automated Variable Selection (2017). Haider Mannan
  • Performance of using multiple stepwise algorithms for variable selection (2010). Ryan E. Wiegand
  • The large-sample performance of backwards variable elimination (2008). Peter C. Austin
  • Using the bootstrap to improve estimation and confidence intervals for regression coefficients selected using backwards variable elimination (2007). Peter C. Austin
  • Bootstrap Methods for Developing Predictive Models (2004). Peter C. Austin, Jack V. Tu
  • Bootstrap model selection had similar performance for selecting authentic and noise variables compared to backward variable elimination: a simulation study (2008). Peter C. Austin
  • Automated variable selection methods for logistic regression produced unstable models for predicting acute myocardial infarction mortality (2004). Peter C. Austin, Jack V. Tu
  • Determining relative importance of variables in developing and validating predictive models (2009). Joseph Beyene, Eshetu G. Atenafu, Jemila S. Hamid, Teresa To, Lillian Sung
  • Bayesian Analysis of Two-Level Fractional Factorial Experiments with Non-Normal Responses (2013). Jianjun Wang, Yizhong Ma