Tuning Parameter Selection in High Dimensional Penalized Likelihood

Yingying Fan; Cheng Yong Tang

doi:10.1111/rssb.12001

journal article Dec 04, 2012

Journal of the Royal Statistical Society Series B: Statistical Methodology Vol. 75 No. 3 pp. 531-552 · Oxford University Press (OUP)

Abstract

Summary. Determining how to select the tuning parameter appropriately is essential in penalized likelihood methods for high dimensional data analysis. We examine this problem in the setting of penalized likelihood methods for generalized linear models, where the dimensionality of covariates p is allowed to increase exponentially with the sample size n. We propose to select the tuning parameter by optimizing the generalized information criterion with an appropriate model complexity penalty. To ensure that we consistently identify the true model, a range for the model complexity penalty is identified in the generalized information criterion. We find that this model complexity penalty should diverge at the rate of some power of log(p) depending on the tail probability behaviour of the response variables. This reveals that using the Akaike information criterion or Bayes information criterion to select the tuning parameter may not be adequate for consistently identifying the true model. On the basis of our theoretical study, we propose a uniform choice of the model complexity penalty and show that the approach proposed consistently identifies the true model among candidate models with asymptotic probability 1. We justify the performance of the procedure proposed by numerical simulations and a gene expression data analysis.
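The selection idea in the summary can be illustrated with a small sketch. The abstract does not give the exact form of the criterion, so everything below is an assumption made for illustration: a Gaussian linear model with unit error variance (so the deviance reduces to the residual sum of squares), a lasso penalty fitted by a hand-written coordinate descent, a criterion of the form (deviance + a_n × model size) / n, and a penalty a_n = log(log n) × log p as one choice that diverges with log(p). The function names `lasso_cd`, `gic` and `select_lambda` are hypothetical and are not taken from the paper.

```python
import numpy as np

def lasso_cd(X, y, lam, beta_init=None, n_iter=200, tol=1e-8):
    """Coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p) if beta_init is None else beta_init.copy()
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y - X @ beta
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Partial correlation of column j with the residual, adding back beta[j].
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid -= X[:, j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta

def gic(X, y, beta, a_n):
    """Assumed criterion: (RSS + a_n * number of nonzero coefficients) / n."""
    n = len(y)
    rss = float(np.sum((y - X @ beta) ** 2))
    return (rss + a_n * np.count_nonzero(beta)) / n

def select_lambda(X, y, lambdas, a_n):
    """Fit along a decreasing lambda grid (warm starts) and keep the GIC minimizer."""
    best_score, best_lam, best_beta = np.inf, None, None
    beta = None
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_cd(X, y, lam, beta_init=beta)
        score = gic(X, y, beta, a_n)
        if score < best_score:
            best_score, best_lam, best_beta = score, lam, beta.copy()
    return best_lam, best_beta

if __name__ == "__main__":
    # Simulated data with p > n and three strong signals (illustrative values only).
    rng = np.random.default_rng(0)
    n, p = 80, 100
    X = rng.standard_normal((n, p))
    beta_true = np.zeros(p)
    beta_true[:3] = 3.0
    y = X @ beta_true + rng.standard_normal(n)

    lam_max = np.max(np.abs(X.T @ y)) / n
    lambdas = lam_max * np.logspace(0, -1.5, 15)
    a_n = np.log(np.log(n)) * np.log(p)  # assumed penalty, diverging with log(p)

    lam_hat, beta_hat = select_lambda(X, y, lambdas, a_n)
    print("selected lambda:", lam_hat)
    print("selected variables:", np.flatnonzero(beta_hat))
```

Setting `a_n = 2` or `a_n = np.log(n)` in the same function gives AIC-type and BIC-type penalties, which allows a side-by-side comparison of the selected model sizes on simulated data.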

Metrics

Citations: 238
References: 34

Details

Published: Dec 04, 2012
Vol/Issue: 75(3)
Pages: 531-552

Authors

Yingying Fan

University of Southern California, Los Angeles, USA

Cheng Yong Tang

University of Colorado, Denver, USA

Funding

University of Southern California

National University of Singapore

National Science Foundation ‘Career’ Award: DMS-1150318

Risk Management Institute

Cite This Article

Yingying Fan, Cheng Yong Tang (2012). Tuning Parameter Selection in High Dimensional Penalized Likelihood. Journal of the Royal Statistical Society Series B: Statistical Methodology, 75(3), 531-552. https://doi.org/10.1111/rssb.12001
