Abstract
The paper addresses joint sparsity selection in the regression coefficient matrix and the error precision (inverse covariance) matrix for high-dimensional multivariate regression models in the Bayesian paradigm. The selected sparsity patterns are crucial for understanding the network of relationships between the predictor and response variables, as well as the conditional relationships among the latter. While Bayesian methods have the advantage of providing natural uncertainty quantification through posterior inclusion probabilities and credible intervals, current Bayesian approaches are restricted to specific sub-classes of sparsity patterns and/or do not scale to settings with hundreds of responses and predictors. Bayesian approaches that focus only on estimating the posterior mode are scalable, but do not generate samples from the posterior distribution for uncertainty quantification. Using a bi-convex regression-based generalized likelihood and spike-and-slab priors, we develop an algorithm called the joint regression network selector (JRNS) for joint regression and covariance selection, which (a) accommodates general sparsity patterns, (b) provides posterior samples for uncertainty quantification, and (c) is scalable and orders of magnitude faster than state-of-the-art Bayesian approaches that provide uncertainty quantification. We demonstrate the statistical and computational efficacy of the proposed approach on synthetic data and through the analysis of selected cancer data sets. We also establish high-dimensional posterior consistency for one of the developed algorithms.
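The spike-and-slab machinery behind posterior inclusion probabilities can be illustrated in a much simpler setting than the paper's. The sketch below is not the JRNS algorithm: it is a standard point-mass spike-and-slab Gibbs sampler for a single-response linear regression with a fixed noise variance, and the prior choices and hyperparameters (`pi0`, `tau2`, `sigma2`) are illustrative assumptions. It shows how Gibbs sampling yields both sparsity selection (inclusion probabilities) and posterior samples for uncertainty quantification.

```python
import numpy as np

def spike_slab_gibbs(X, y, n_iter=2000, burn=500, pi0=0.2, tau2=1.0,
                     sigma2=1.0, seed=0):
    """Gibbs sampler for a point-mass spike-and-slab prior:
    beta_j = z_j * theta_j, z_j ~ Bernoulli(pi0), theta_j ~ N(0, tau2),
    with the noise variance sigma2 held fixed for simplicity.
    Returns posterior inclusion probabilities and the posterior mean of beta."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    z = np.zeros(p, dtype=bool)
    xtx = np.sum(X * X, axis=0)              # x_j' x_j for each column
    z_sum = np.zeros(p)
    beta_sum = np.zeros(p)
    for it in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed
            r = y - X @ beta + X[:, j] * beta[j]
            v = 1.0 / (xtx[j] / sigma2 + 1.0 / tau2)  # conditional slab variance
            m = v * (X[:, j] @ r) / sigma2            # conditional slab mean
            # log-odds of inclusion after integrating out theta_j
            log_odds = (np.log(pi0 / (1.0 - pi0))
                        + 0.5 * np.log(v / tau2)
                        + 0.5 * m * m / v)
            prob = np.exp(log_odds - np.logaddexp(0.0, log_odds))  # stable sigmoid
            z[j] = rng.random() < prob
            beta[j] = rng.normal(m, np.sqrt(v)) if z[j] else 0.0
        if it >= burn:
            z_sum += z
            beta_sum += beta
    keep = n_iter - burn
    return z_sum / keep, beta_sum / keep

# synthetic check: 3 active predictors out of 10
rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.standard_normal(n)
pip, beta_hat = spike_slab_gibbs(X, y)
```

Averaging the sampled indicators `z` over post-burn-in iterations gives the posterior inclusion probabilities referred to in the abstract; the retained draws of `beta` would likewise give credible intervals. JRNS extends this idea to a multivariate response, with spike-and-slab priors on both the coefficient matrix and the error precision matrix under a generalized likelihood.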
Acknowledgements
The work of GM was supported in part by NIH grants 1U01CA235489-01 and 1R01GM114029-01A1.
Cite this article
Samanta, S., Khare, K. & Michailidis, G. A generalized likelihood-based Bayesian approach for scalable joint regression and covariance selection in high dimensions. Stat Comput 32, 47 (2022). https://doi.org/10.1007/s11222-022-10102-5