Weighted Lasso estimates for sparse logistic regression: non-asymptotic properties with measurement errors

Huang, Huamei; Gao, Yujing; Zhang, Huiming; Li, Bo

doi:10.1007/s10473-021-0112-6

Published: 24 December 2020

Volume 41, pages 207–230, (2021)

Acta Mathematica Scientia

Abstract

For high-dimensional models with a focus on classification performance, ℓ₁-penalized logistic regression is becoming important and popular. However, the Lasso estimates can be problematic when the penalties on different coefficients are all the same and unrelated to the data. We propose two types of weighted Lasso estimates whose weights depend on the covariates and are determined via the McDiarmid inequality. Given sample size n and covariate dimension p, the finite-sample behavior of the proposed method with a diverging number of predictors is described by non-asymptotic oracle inequalities, such as bounds on the ℓ₁-estimation error and the squared prediction error of the unknown parameters. We compare the performance of our method with that of earlier weighted estimates on simulated data, and then apply it to real data.
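To make the idea of coefficient-specific penalties concrete, the following is a minimal sketch of a weighted ℓ₁-penalized logistic regression, fitted by proximal gradient descent (a generic solver chosen here for illustration, not taken from the article). It minimizes the average logistic loss plus lam · Σⱼ wⱼ|βⱼ|. The function names, the simulated data, and in particular the weights wⱼ (set to the root mean square of each covariate column) are assumptions; they are not the McDiarmid-based weights proposed by the authors, which are not given in the abstract.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def soft_threshold(v, t):
    """Proximal operator of the (weighted) l1 norm; t may be a vector."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def penalized_objective(X, y, beta, weights, lam):
    """Average logistic loss plus the weighted l1 penalty."""
    z = X @ beta
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    return loss + lam * np.sum(weights * np.abs(beta))


def weighted_lasso_logistic(X, y, weights, lam, n_iter=500):
    """Proximal gradient descent for weighted-Lasso logistic regression.

    weights: nonnegative vector of length p, one penalty weight per coefficient.
    """
    n, p = X.shape
    # The gradient of the average logistic loss is Lipschitz with constant
    # sigma_max(X)^2 / (4 n); use the reciprocal as a fixed step size.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ beta) - y) / n
        beta = soft_threshold(beta - step * grad, step * lam * weights)
    return beta


# Illustrative usage on simulated data (all values below are hypothetical).
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = (rng.random(n) < sigmoid(X @ beta_true)).astype(float)

# Hypothetical data-dependent weights: root mean square of each column.
# These are NOT the article's McDiarmid-based weights.
weights = np.sqrt(np.mean(X ** 2, axis=0))

beta_hat = weighted_lasso_logistic(X, y, weights, lam=0.05)
print("nonzero coefficients:", np.count_nonzero(beta_hat))
```

Setting all weights to 1 recovers the ordinary Lasso penalty, which is the case the abstract describes as potentially problematic because the penalties ignore the data.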

Author information

Huamei Huang, Yujing Gao and Huiming Zhang are co-first authors and contributed equally to this work.

Authors and Affiliations

Department of Statistics and Finance, University of Science and Technology of China, Hefei, 230026, China
Huamei Huang
Guanghua School of Management, Peking University, Beijing, 100871, China
Yujing Gao
School of Mathematical Sciences, Peking University, Beijing, 100871, China
Huiming Zhang
School of Mathematics and Statistics, Central China Normal University, Wuhan, 430079, China
Bo Li

Corresponding author

Correspondence to Bo Li.

Additional information

Supported by the National Natural Science Foundation of China (61877023) and the Fundamental Research Funds for the Central Universities (CCNU19TD009).

About this article

Cite this article

Huang, H., Gao, Y., Zhang, H. et al. Weighted Lasso estimates for sparse logistic regression: non-asymptotic properties with measurement errors. Acta Math Sci 41, 207–230 (2021). https://doi.org/10.1007/s10473-021-0112-6

Received: 06 November 2019
Revised: 17 September 2020
Published: 24 December 2020
Issue Date: January 2021
DOI: https://doi.org/10.1007/s10473-021-0112-6
