
Bayesian additive regression trees with model trees


Abstract

Bayesian additive regression trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART places regularisation priors on a set of trees that act as weak learners, making it highly flexible for prediction in the presence of nonlinearity and high-order interactions. In this paper, we introduce an extension of BART, called model trees BART (MOTR-BART), that uses piecewise linear functions at the node level instead of piecewise constants. In MOTR-BART, rather than a single predicted value at each terminal node, a linear predictor is estimated using the covariates that have been chosen as split variables in the corresponding tree. In our approach, local linearities are captured more efficiently and fewer trees are required to achieve equal or better performance than BART. Via simulation studies and real data applications, we compare MOTR-BART to its main competitors. R code for the MOTR-BART implementation is available at https://github.com/ebprado/MOTR-BART.
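To make the node-level difference concrete, the sketch below contrasts the two kinds of prediction in R. This is our illustration under simplifying assumptions, not the authors' implementation: the function names are hypothetical and the least-squares fit stands in for the Bayesian update of the node-level coefficients described in the paper.

    # BART: one constant, repeated for every observation in the terminal node
    node_predict_bart <- function(y_node) {
      rep(mean(y_node), length(y_node))
    }

    # MOTR-BART: intercept plus linear terms in the tree's split covariates
    # (least squares here as a stand-in for the Bayesian coefficient update)
    node_predict_motr <- function(y_node, X_node, split_vars) {
      dat <- data.frame(y = y_node, X_node[, split_vars, drop = FALSE])
      fit <- lm(y ~ ., data = dat)
      unname(predict(fit))
    }

    set.seed(1)
    X <- matrix(runif(150), ncol = 3, dimnames = list(NULL, paste0("x", 1:3)))
    y <- 2 * X[, 1] + rnorm(50, sd = 0.1)
    head(node_predict_bart(y))           # flat within the node
    head(node_predict_motr(y, X, "x1"))  # varies linearly with x1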


References

  • Albert, J.H., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88(422), 669–679 (1993)


  • Athey, S., Tibshirani, J., Wager, S.: Generalized random forests. Ann. Stat. 47(2), 1148–1178 (2019)


  • Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)


  • Carvalho, C.M., Polson, N.G., Scott, J.G.: The horseshoe estimator for sparse signals. Biometrika 97(2), 465–480 (2010)


  • Chipman, H.A., George, E.I., McCulloch, R.E.: Bayesian CART model search. J. Am. Stat. Assoc. 93(443), 935–948 (1998)


  • Chipman, H.A., George, E.I., McCulloch, R.E.: BART: Bayesian additive regression trees. Ann. Appl. Stat. 4(1), 266–298 (2010)


  • Deshpande, S.K., Bai, R., Balocchi, C., Starling, J.E.: VCBART: Bayesian trees for varying coefficients (2020). arXiv preprint arXiv:2003.06416

  • Friedberg, R., Tibshirani, J., Athey, S., Wager, S.: Local linear forests (2018). arXiv preprint arXiv:1807.11408

  • Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1–22 (2010)


  • Friedman, J.H.: Multivariate adaptive regression splines. Ann. Stat. 19(1), 1–67 (1991)

  • Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)

  • Green, D.P., Kern, H.L.: Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public Opin. Quart. 76(3), 491–511 (2012)


  • Greenwell, B., Boehmke, B., Cunningham, J., GBM Developers: gbm: Generalized boosted regression models. https://CRAN.R-project.org/package=gbm, R package version 2.1.5 (2019)

  • Hahn, P.R., Murray, J.S., Carvalho, C.M.: Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects. Bayesian Anal. 15(3), 965–1056 (2020)

  • He, J., Yalov, S., Hahn, P.R.: XBART: Accelerated Bayesian additive regression trees. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, vol 89 (2019)

  • Hernández, B., Pennington, S.R., Parnell, A.C.: Bayesian methods for proteomic biomarker development. EuPA Open Proteom. 9, 54–64 (2015)


  • Hernández, B., Raftery, A.E., Pennington, S.R., Parnell, A.C.: Bayesian additive regression trees using Bayesian model averaging. Stat. Comput. 28(4), 869–890 (2018)


  • Hill, J.L.: Bayesian nonparametric modeling for causal inference. J. Comput. Graph. Stat. 20(1), 217–240 (2011)


  • Kapelner, A., Bleich, J.: bartMachine: Machine learning with Bayesian additive regression trees. J. Stat. Softw. 70(4), 1–40 (2016). https://doi.org/10.18637/jss.v070.i04


  • Kindo, B.P., Wang, H., Hanson, T., Peña, E.A.: Bayesian quantile additive regression trees (2016a). arXiv preprint arXiv:1607.02676

  • Kindo, B.P., Wang, H., Peña, E.A.: Multinomial probit Bayesian additive regression trees. Stat 5(1), 119–131 (2016b)


  • Künzel, S.R., Saarinen, T.F., Liu, E.W., Sekhon, J.S.: Linear aggregation in tree-based estimators (2019). arXiv preprint arXiv:1906.06463

  • Landwehr, N., Hall, M., Frank, E.: Logistic model trees. Mach. Learn. 59(1–2), 161–205 (2005)


  • Linero, A.: SoftBart: A package for implementing the SoftBart algorithm. R package version 1 (2017a)

  • Linero, A.R.: A review of tree-based Bayesian methods. Commun. Stat. Appl. Methods 24(6) (2017b)

  • Linero, A.R.: Bayesian regression trees for high-dimensional prediction and variable selection. J. Am. Stat. Assoc. 113(522), 626–636 (2018)


  • Linero, A.R., Yang, Y.: Bayesian regression tree ensembles that adapt to smoothness and sparsity. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 80(5), 1087–1110 (2018)


  • Linero, A.R., Sinha, D., Lipsitz, S.R.: Semiparametric mixed-scale models using shared Bayesian forests (2018). arXiv preprint arXiv:1809.08521

  • McCulloch, R., Sparapani, R., Gramacy, R., Spanbauer, C., Pratola, M.: BART: Bayesian Additive Regression Trees. https://CRAN.R-project.org/package=BART, R package version 2.7 (2019)

  • Murray, J.S.: Log-linear Bayesian additive regression trees for categorical and count responses (2017). arXiv preprint arXiv:1701.01503

  • Pratola, M., Chipman, H., George, E., McCulloch, R.: Heteroscedastic BART using multiplicative regression trees (2017). arXiv preprint arXiv:1709.07542

  • Quinlan, J.R.: Learning with continuous classes. In: 5th Australian Joint Conference on Artificial Intelligence, World Scientific, vol 92, pp 343–348 (1992)

  • R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2020). https://www.R-project.org/

  • Ročková, V., van der Pas, S.: Posterior concentration for Bayesian regression trees and forests (2017). arXiv preprint arXiv:1708.08734

  • Ročková, V., Saha, E.: On theory for BART (2018). arXiv preprint arXiv:1810.00787

  • Schnell, P.M., Tang, Q., Offen, W.W., Carlin, B.P.: A Bayesian credible subgroups approach to identifying patient subgroups with positive treatment effects. Biometrics 72(4), 1026–1036 (2016)


  • Sivaganesan, S., Müller, P., Huang, B.: Subgroup finding via Bayesian additive regression trees. Stat. Med. 36(15), 2391–2403 (2017)


  • Sparapani, R., Logan, B.R., McCulloch, R.E., Laud, P.W.: Nonparametric competing risks analysis using Bayesian additive regression trees. Stat. Methods Med. Res. (2019). Article 0962280218822140

  • Sparapani, R.A., Logan, B.R., McCulloch, R.E., Laud, P.W.: Nonparametric survival analysis using Bayesian additive regression trees (BART). Stat. Med. 35(16), 2741–2753 (2016)


  • Starling, J.E., Aiken, C.E., Murray, J.S., Nakimuli, A., Scott, J.G.: Monotone function estimation in the presence of extreme data coarsening: Analysis of preeclampsia and birth weight in urban Uganda (2019). arXiv preprint arXiv:1912.06946

  • Starling, J.E., Murray, J.S., Carvalho, C.M., Bukowski, R.K., Scott, J.G., et al.: BART with targeted smoothing: an analysis of patient-specific stillbirth risk. Ann. Appl. Stat. 14(1), 28–50 (2020)


  • Tibshirani, J., Athey, S., Wager, S.: grf: Generalized Random Forests (2020). https://CRAN.R-project.org/package=grf, R package version 1.2.0

  • Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc.: Ser. B (Methodol.) 58(1), 267–288 (1996)


  • Wang, Y., Witten, I., van Someren, M., Widmer, G.: Inducing model trees for continuous classes. In: Proceedings of the Poster Papers of the European Conference on Machine Learning, Department of Computer Science, University of Waikato, New Zealand (1997)

  • Wright, M.N., Ziegler, A.: ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Softw. 77(1), 1–17 (2017). https://doi.org/10.18637/jss.v077.i01


  • Zhang, J.L., Härdle, W.K.: The Bayesian additive classification tree applied to credit risk modelling. Comput. Stat. Data Anal. 54(5), 1197–1205 (2010)



Acknowledgements

We thank the editors and the two anonymous referees for their comments, which greatly improved an earlier version of the paper. This work was supported by a Science Foundation Ireland Career Development Award (grant number 17/CDA/4695).

Author information


Corresponding author

Correspondence to Estevão B. Prado.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Simulation results

In this section, we present results for the simulation scenarios described in Sect. 5.1. In total, 9 data sets were created from Friedman’s equation using combinations of sample size (n) and number of covariates (p). Tables 1 and 2 report the medians and quartiles of the RMSE for MOTR-BART, BART, GB, RF, lasso, soft BART and LLF; the same values are shown graphically in Fig. 3. In addition, Table 3 presents the mean number of parameters utilised by BART, MOTR-BART and soft BART to calculate the final prediction.
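For reference, the mean function in Friedman’s equation is \(10\sin(\pi x_1 x_2) + 20(x_3 - 0.5)^2 + 10x_4 + 5x_5\), with Uniform(0, 1) covariates, where covariates beyond the first five do not enter the mean. A minimal R sketch of one such data set follows; the function name and noise level are illustrative assumptions rather than the exact simulation settings of Sect. 5.1.

    # One Friedman-equation data set (illustrative settings)
    sim_friedman <- function(n, p, sigma = 1) {
      stopifnot(p >= 5)
      X <- matrix(runif(n * p), nrow = n, ncol = p)
      mu <- 10 * sin(pi * X[, 1] * X[, 2]) + 20 * (X[, 3] - 0.5)^2 +
        10 * X[, 4] + 5 * X[, 5]
      list(X = X, y = mu + rnorm(n, sd = sigma))
    }

    train <- sim_friedman(n = 1000, p = 10)  # one (n, p) combination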

Table 1 Median of the RMSE on test data of the Friedman data sets when \(n = 200 \text{ and } 500\)
Table 2 Median of the RMSE on test data of the Friedman data sets when \(n = 1000\)
Table 3 Friedman data sets: mean and standard deviation of the total number of terminal nodes created by BART and soft BART to generate the final prediction over 5000 iterations

Appendix B: Real data results

This appendix presents two tables with results for the Ankara, Boston, Ozone and Compactiv data sets. Table 4 reports the median and quartiles of the RMSE computed on 10 test sets; the same values are shown graphically in Fig. 4 of Sect. 5.2. Further, Table 5 shows the mean number of parameters utilised by BART, MOTR-BART and soft BART to calculate the final prediction for these data sets.
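As a pointer to how the entries in Table 4 are computed, the sketch below summarises one algorithm's RMSE over the 10 test sets by its median and first and third quartiles; the random placeholder predictions are purely illustrative and would in practice come from each fitted model.

    # RMSE per test split, then its median and quartiles
    rmse <- function(y_obs, y_pred) sqrt(mean((y_obs - y_pred)^2))

    rmse_values <- replicate(10, rmse(rnorm(100), rnorm(100)))
    quantile(rmse_values, probs = c(0.25, 0.50, 0.75))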

Table 4 Real data sets: comparison of the median RMSE (and first and third quartiles) for Ankara, Boston, Ozone and Compactiv data sets on test data
Table 5 Real data sets: mean and standard deviation of the total number of terminal nodes created by BART and soft BART to generate the final prediction over 5000 iterations


About this article


Cite this article

Prado, E.B., Moral, R.A. & Parnell, A.C. Bayesian additive regression trees with model trees. Stat Comput 31, 20 (2021). https://doi.org/10.1007/s11222-021-09997-3
