Group penalized quantile regression

  • Review Paper
  • Statistical Methods & Applications

Abstract

Quantile regression models have become a widely used statistical tool in genetics and the omics fields because they provide a rich description of the predictors’ effects on an outcome without imposing stringent parametric assumptions on the outcome-predictor relationship. This work considers the problem of selecting grouped variables in high-dimensional linear quantile regression models. We introduce a group penalized pseudo quantile regression (GPQR) framework with both group-lasso and group non-convex penalties. We approximate the quantile regression check function with a pseudo-quantile check function. Then, using the majorization–minimization principle, we derive a simple and computationally efficient group-wise descent algorithm for solving group penalized quantile regression. We establish the convergence rate of our algorithm with the group-lasso penalty and illustrate the performance of the GPQR approach using simulations in high-dimensional settings. Furthermore, we demonstrate the use of the GPQR method in a gene-based association analysis of data from the Alzheimer’s Disease Neuroimaging Initiative study and in an epigenetic analysis of DNA methylation data.
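
To make the idea of a smoothed check function combined with group-wise majorization descent concrete, the following is a minimal Python sketch. It is not the authors' GPQR implementation. The abstract does not give the form of the pseudo-quantile check function, so a Huber-type smoothing with parameter `delta` is used here as a stand-in assumption. The function names, the `sqrt(group size)` penalty weights, and the omission of an intercept are likewise illustrative choices, and only the group-lasso penalty is covered.

```python
import numpy as np

def check_loss(u, tau):
    """Exact quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def smooth_check_loss(u, tau, delta):
    """Huber-smoothed check function (assumed stand-in for the pseudo-quantile loss)."""
    a = np.abs(u)
    huber = np.where(a <= delta, u**2 / (2 * delta), a - delta / 2)
    return 0.5 * huber + (tau - 0.5) * u

def smooth_check_grad(u, tau, delta):
    """Derivative of smooth_check_loss in u; Lipschitz with constant 1 / (2 * delta)."""
    return 0.5 * np.clip(u / delta, -1.0, 1.0) + (tau - 0.5)

def objective(X, y, beta, groups, tau, delta, lam):
    """Mean smoothed loss plus group-lasso penalty with sqrt(group size) weights."""
    loss = smooth_check_loss(y - X @ beta, tau, delta).mean()
    penalty = sum(np.sqrt(len(g)) * np.linalg.norm(beta[g]) for g in groups)
    return loss + lam * penalty

def group_mm_descent(X, y, groups, tau=0.5, delta=0.1, lam=0.1, n_sweeps=100):
    """Cyclic group-wise descent: each group update minimizes a quadratic majorizer."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta  # current residuals
    # Curvature bound per group: largest eigenvalue of X_g' X_g / (2 * delta * n).
    L = [np.linalg.eigvalsh(X[:, g].T @ X[:, g]).max() / (2 * delta * n)
         for g in groups]
    history = [objective(X, y, beta, groups, tau, delta, lam)]
    for _ in range(n_sweeps):
        for g, L_g in zip(groups, L):
            # Gradient of the smoothed loss with respect to the group's coefficients.
            grad = -X[:, g].T @ smooth_check_grad(r, tau, delta) / n
            z = beta[g] - grad / L_g
            norm_z = np.linalg.norm(z)
            thresh = lam * np.sqrt(len(g)) / L_g
            # Group soft-thresholding: the whole group is set to zero or shrunk.
            new = np.zeros_like(z) if norm_z <= thresh else (1 - thresh / norm_z) * z
            r -= X[:, g] @ (new - beta[g])
            beta[g] = new
        history.append(objective(X, y, beta, groups, tau, delta, lam))
    return beta, history
```

Because each group update minimizes an upper bound that touches the objective at the current iterate, the recorded objective values should never increase. A smaller `delta` brings the smoothed loss closer to the exact check function but raises the curvature bound, which shortens the steps.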

Acknowledgements

This work is supported by the Natural Sciences and Engineering Research Council of Canada through an individual discovery research grant to Karim Oualkacha and by the Fonds de recherche du Québec-Santé through individual Grant #31110 to Karim Oualkacha. Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense Award Number W81XWH-12-2-0012). The ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research and Development, LLC.; Johnson and Johnson Pharmaceutical Research and Development LLC.; Lumosity; Lundbeck; Merck and Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research provides funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California.

Author information

Corresponding author

Correspondence to Mohamed Ouhourane.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

For the Alzheimer’s Disease Neuroimaging Initiative.

Supplementary Information

The supplementary document (.pdf file) includes proofs of Propositions 1, 2, and 3 and Theorem 1 of the main manuscript. It also contains the theoretical and numerical developments of the KKT conditions, three additional illustrative figures, and one table of results from the analysis of the ADNI data using the GPQR approach.

Supplementary material 1 (pdf 481 KB)

About this article

Cite this article

Ouhourane, M., Yang, Y., Benedet, A.L. et al. Group penalized quantile regression. Stat Methods Appl 31, 495–529 (2022). https://doi.org/10.1007/s10260-021-00580-8

  • DOI: https://doi.org/10.1007/s10260-021-00580-8
