Published by De Gruyter August 1, 2020

Multivariate quasi-beta regression models for continuous bounded data

  • Ricardo R. Petterle, Wagner H. Bonat, Cassius T. Scarpin, Thaísa Jonasson and Victória Z. C. Borba

Abstract

We propose a multivariate regression model for multiple continuous bounded responses. The proposed model is based on second-moment assumptions only. We adopt the quasi-score and Pearson estimating functions to estimate the regression and dispersion parameters, respectively. Thus, the proposed approach does not require a multivariate probability distribution for the response vector. The multivariate quasi-beta regression model can easily handle multiple continuous bounded outcomes while taking into account the correlation between the response variables. Furthermore, the model allows us to analyze continuous bounded data on the interval [0, 1], including zeros and/or ones. Simulation studies were conducted to investigate the behavior of the NORmal To Anything (NORTA) algorithm and to check the properties of the estimating function estimators when dealing with multiple correlated response variables generated from marginal beta distributions. The model was motivated by a data set on body fat percentage, measured at five regions of the body, which constitute the response variables. We analyze each response variable separately and compare these fits with that of the proposed multivariate model. The multivariate quasi-beta regression model provides a better fit than its univariate counterparts and also allows us to measure the correlation between response variables. Finally, we adapt diagnostic tools to the proposed model. The data set and R code are provided in the supplementary material.
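
To illustrate the kind of data generation the simulation studies refer to, the following is a minimal sketch of a NORTA-style transform: correlated standard normals are mapped to uniforms through the normal CDF and then to beta marginals through the beta quantile function. It is not the authors' code (their R code is in the supplementary material). The function names, the parameter `rho_z`, and the restriction to Beta(a, 1) marginals (chosen only because their quantile function u^(1/a) has a closed form) are assumptions made for this example. A full NORTA implementation would also solve for the normal correlation that yields a target correlation between the responses; that step is omitted here.

```python
import math
import random
from statistics import NormalDist


def norta_beta_pair(n, rho_z, a1, a2, seed=0):
    """Draw n pairs with Beta(a1, 1) and Beta(a2, 1) marginals.

    rho_z is the correlation of the underlying bivariate normal,
    not the correlation of the resulting beta variables.
    """
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF
    pairs = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho_z * z1 + math.sqrt(1.0 - rho_z ** 2) * rng.gauss(0.0, 1.0)
        # Normal -> uniform -> beta (quantile of Beta(a, 1) is u ** (1 / a)).
        u1, u2 = phi(z1), phi(z2)
        pairs.append((u1 ** (1.0 / a1), u2 ** (1.0 / a2)))
    return pairs


def pearson_correlation(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    sample = norta_beta_pair(n=20000, rho_z=0.7, a1=2.0, a2=3.0)
    print("sample correlation:", pearson_correlation(sample))
```

The induced correlation between the two bounded responses generally differs from `rho_z`, which is why the behavior of the algorithm is worth checking by simulation.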


Corresponding author: Ricardo R. Petterle, Department of Integrative Medicine, Federal University of Paraná, Curitiba, Brazil

Acknowledgments

The authors thank the two referees for their helpful comments and suggestions that greatly improved the presentation of this paper.

  1. Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.


Supplementary material

http://www.leg.ufpr.br/doku.php/publications:papercompanions:multquasibeta


Received: 2019-12-23
Accepted: 2020-06-22
Published Online: 2020-08-01

© 2020 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 13.5.2024 from https://www.degruyter.com/document/doi/10.1515/ijb-2019-0163/html