Abstract
Bayesian additive regression trees (BART) is a tree-based machine learning method that has been successfully applied to regression and classification problems. BART places regularisation priors on a set of trees that act as weak learners, and it is highly flexible for prediction in the presence of nonlinearity and high-order interactions. In this paper, we introduce an extension of BART, called model trees BART (MOTR-BART), that uses piecewise linear functions at the node level rather than piecewise constants. In MOTR-BART, rather than a single predicted value per terminal node, a linear predictor is estimated from the covariates that have been used as split variables in the corresponding tree. With this approach, local linearities are captured more efficiently, and fewer trees are required to achieve performance equal to or better than BART. Via simulation studies and real data applications, we compare MOTR-BART to its main competitors. R code implementing MOTR-BART is available at https://github.com/ebprado/MOTR-BART.
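The core idea, replacing each terminal node's fitted constant with a linear predictor in the covariates used as split variables along the path to that node, can be illustrated with a minimal sketch. This is not the authors' implementation (which is in R); the tree structure and all coefficients below are purely illustrative:

```python
def leaf_predict_constant(node, x):
    # BART-style leaf: a single fitted constant, regardless of x
    return node["mu"]

def leaf_predict_linear(node, x):
    # MOTR-BART-style leaf: intercept plus a linear term in the
    # covariates that appear as split variables in this tree
    return node["beta0"] + sum(node["beta"][j] * x[j] for j in node["split_vars"])

# A toy one-split tree on covariate 0 (all numbers are made up):
tree = {
    "split_var": 0, "split_val": 0.5,
    "left":  {"mu": 1.0, "beta0": 0.8, "beta": {0: 0.4}, "split_vars": [0]},
    "right": {"mu": 2.0, "beta0": 1.5, "beta": {0: 1.0}, "split_vars": [0]},
}

def predict(tree, x, leaf_fn):
    # route x to a leaf, then apply the chosen leaf prediction rule
    node = tree["left"] if x[tree["split_var"]] < tree["split_val"] else tree["right"]
    return leaf_fn(node, x)

x = [0.3, 0.7]
print(predict(tree, x, leaf_predict_constant))  # constant leaf: 1.0
print(predict(tree, x, leaf_predict_linear))    # linear leaf: 0.8 + 0.4 * 0.3
```

In the full model, as in BART, the response is the sum of such predictions over an ensemble of trees; the linear leaves are what allow MOTR-BART to capture local linearity with fewer trees.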
References
Albert, J.H., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88(422), 669–679 (1993)
Athey, S., Tibshirani, J., Wager, S., et al.: Generalized random forests. Ann. Stat. 47(2), 1148–1178 (2019)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Carvalho, C.M., Polson, N.G., Scott, J.G.: The horseshoe estimator for sparse signals. Biometrika 97(2), 465–480 (2010)
Chipman, H.A., George, E.I., McCulloch, R.E.: Bayesian CART model search. J. Am. Stat. Assoc. 93(443), 935–948 (1998)
Chipman, H.A., George, E.I., McCulloch, R.E., et al.: BART: Bayesian additive regression trees. Ann. Appl. Stat. 4(1), 266–298 (2010)
Deshpande, S.K., Bai, R., Balocchi, C., Starling, J.E.: VC-BART: Bayesian trees for varying coefficients. arXiv preprint arXiv:2003.06416 (2020)
Friedberg, R., Tibshirani, J., Athey, S., Wager, S.: Local linear forests. arXiv preprint arXiv:1807.11408 (2018)
Friedman, J., Hastie, T., Tibshirani, R.: Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33(1), 1 (2010)
Friedman, J.H.: Multivariate adaptive regression splines. Ann. Stat. 19(1), 1–67 (1991)
Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)
Green, D.P., Kern, H.L.: Modeling heterogeneous treatment effects in survey experiments with Bayesian additive regression trees. Public Opin. Q. 76(3), 491–511 (2012)
Greenwell, B., Boehmke, B., Cunningham, J., GBM Developers: gbm: Generalized Boosted Regression Models. https://CRAN.R-project.org/package=gbm, R package version 2.1.5 (2019)
Hahn, P.R., Murray, J.S., Carvalho, C.M., et al.: Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects. Bayesian Analysis (2020)
He, J., Yalov, S., Hahn, P.R.: XBART: accelerated Bayesian additive regression trees. In: Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, vol. 89 (2019)
Hernández, B., Pennington, S.R., Parnell, A.C.: Bayesian methods for proteomic biomarker development. EuPA Open Proteom. 9, 54–64 (2015)
Hernández, B., Raftery, A.E., Pennington, S.R., Parnell, A.C.: Bayesian additive regression trees using Bayesian model averaging. Stat. Comput. 28(4), 869–890 (2018)
Hill, J.L.: Bayesian nonparametric modeling for causal inference. J. Comput. Gr. Stat. 20(1), 217–240 (2011)
Kapelner, A., Bleich, J.: bartMachine: Machine learning with Bayesian additive regression trees. J. Stat. Softw. 70(4), 1–40 (2016). https://doi.org/10.18637/jss.v070.i04
Kindo, B.P., Wang, H., Hanson, T., Peña, E.A.: Bayesian quantile additive regression trees. arXiv preprint arXiv:1607.02676 (2016a)
Kindo, B.P., Wang, H., Peña, E.A.: Multinomial probit Bayesian additive regression trees. Stat 5(1), 119–131 (2016b)
Künzel, S.R., Saarinen, T.F., Liu, E.W., Sekhon, J.S.: Linear aggregation in tree-based estimators. arXiv preprint arXiv:1906.06463 (2019)
Landwehr, N., Hall, M., Frank, E.: Logistic model trees. Mach. Learn. 59(1–2), 161–205 (2005)
Linero, A.: SoftBart: a package for implementing the SoftBart algorithm. R package version 1 (2017a)
Linero, A.R.: A review of tree-based Bayesian methods. Commun. Stat. Appl. Methods 24(6) (2017b)
Linero, A.R.: Bayesian regression trees for high-dimensional prediction and variable selection. J. Am. Stat. Assoc. 113(522), 626–636 (2018)
Linero, A.R., Yang, Y.: Bayesian regression tree ensembles that adapt to smoothness and sparsity. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 80(5), 1087–1110 (2018)
Linero, A.R., Sinha, D., Lipsitz, S.R.: Semiparametric mixed-scale models using shared Bayesian forests. arXiv preprint arXiv:1809.08521 (2018)
McCulloch, R., Sparapani, R., Gramacy, R., Spanbauer, C., Pratola, M.: BART: Bayesian Additive Regression Trees. https://CRAN.R-project.org/package=BART, R package version 2.7 (2019)
Murray, J.S.: Log-linear Bayesian additive regression trees for categorical and count responses. arXiv preprint arXiv:1701.01503 (2017)
Pratola, M., Chipman, H., George, E., McCulloch, R.: Heteroscedastic BART using multiplicative regression trees. arXiv preprint arXiv:1709.07542 (2017)
Quinlan, J.R.: Learning with continuous classes. In: 5th Australian Joint Conference on Artificial Intelligence, World Scientific, vol. 92, pp. 343–348 (1992)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2020). https://www.R-project.org/
Ročková, V., van der Pas, S.: Posterior concentration for Bayesian regression trees and forests. arXiv preprint arXiv:1708.08734 (2017)
Ročková, V., Saha, E.: On theory for BART. arXiv preprint arXiv:1810.00787 (2018)
Schnell, P.M., Tang, Q., Offen, W.W., Carlin, B.P.: A Bayesian credible subgroups approach to identifying patient subgroups with positive treatment effects. Biometrics 72(4), 1026–1036 (2016)
Sivaganesan, S., Müller, P., Huang, B.: Subgroup finding via Bayesian additive regression trees. Stat. Med. 36(15), 2391–2403 (2017)
Sparapani, R., Logan, B.R., McCulloch, R.E., Laud, P.W.: Nonparametric competing risks analysis using Bayesian additive regression trees. Stat. Methods Med. Res. 0962280218822140 (2019)
Sparapani, R.A., Logan, B.R., McCulloch, R.E., Laud, P.W.: Nonparametric survival analysis using Bayesian additive regression trees (BART). Stat. Med. 35(16), 2741–2753 (2016)
Starling, J.E., Aiken, C.E., Murray, J.S., Nakimuli, A., Scott, J.G.: Monotone function estimation in the presence of extreme data coarsening: analysis of preeclampsia and birth weight in urban Uganda. arXiv preprint arXiv:1912.06946 (2019)
Starling, J.E., Murray, J.S., Carvalho, C.M., Bukowski, R.K., Scott, J.G., et al.: BART with targeted smoothing: an analysis of patient-specific stillbirth risk. Ann. Appl. Stat. 14(1), 28–50 (2020)
Tibshirani, J., Athey, S., Wager, S.: grf: Generalized Random Forests. https://CRAN.R-project.org/package=grf, R package version 1.2.0 (2020)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc.: Ser. B (Methodol.) 58(1), 267–288 (1996)
Wang, Y., Witten, I., van Someren, M., Widmer, G.: Inducing model trees for continuous classes. In: Proceedings of the Poster Papers of the European Conference on Machine Learning, Department of Computer Science, University of Waikato, New Zealand (1997)
Wright, M.N., Ziegler, A.: ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Softw. 77(1), 1–17 (2017). https://doi.org/10.18637/jss.v077.i01
Zhang, J.L., Härdle, W.K.: The Bayesian additive classification tree applied to credit risk modelling. Comput. Stat. Data Anal. 54(5), 1197–1205 (2010)
Acknowledgements
We thank the editors and the two anonymous referees for their comments, which greatly improved an earlier version of the paper. This work was supported by a Science Foundation Ireland Career Development Award (grant number 17/CDA/4695).
Appendices
Appendix A: Simulation results
In this section, we present results for the simulation scenarios described in Sect. 5.1. In total, 9 data sets were generated from Friedman's equation under different combinations of sample size (n) and number of covariates (p). Tables 1 and 2 report the medians and quartiles of the RMSE for MOTR-BART, BART, GB, RF, lasso, soft BART and LLF; these values are shown graphically in Fig. 3. In addition, Table 3 presents the mean number of parameters used by BART, MOTR-BART and soft BART to compute the final prediction.
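Friedman's equation, the basis of these simulations, can be sketched as follows. Only the first five covariates affect the response; any further covariates are noise. The sample sizes, noise level, and number of covariates used in the paper are those specified in Sect. 5.1; the values below are placeholders for illustration:

```python
import numpy as np

def friedman(n, p, sigma=1.0, seed=42):
    # Friedman (1991) benchmark: y depends on x1..x5 only, so the
    # remaining p - 5 covariates act as pure noise predictors
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n, p))
    y = (10 * np.sin(np.pi * X[:, 0] * X[:, 1])
         + 20 * (X[:, 2] - 0.5) ** 2
         + 10 * X[:, 3]
         + 5 * X[:, 4]
         + rng.normal(scale=sigma, size=n))
    return X, y

X, y = friedman(n=200, p=10)  # illustrative n and p, not the paper's grid
print(X.shape, y.shape)       # (200, 10) (200,)
```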
Appendix B: Real data results
This appendix presents two tables with results for the Ankara, Boston, Ozone and Compactiv data sets. Table 4 reports the median and quartiles of the RMSE computed over 10 test sets; these values correspond to Fig. 4 in Sect. 5.2. Further, Table 5 shows the mean number of parameters used by BART, MOTR-BART and soft BART to compute the final prediction for these data sets.
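The summary reported in Table 4 is the median and quartiles of per-split RMSE values over repeated train/test splits. A generic sketch (not the paper's code; the split RMSE values below are made up):

```python
import numpy as np

def rmse(y_true, y_pred):
    # root mean squared error between observed and predicted values
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical RMSE values from 10 test sets, summarised as in Table 4:
split_rmses = [1.10, 0.95, 1.22, 1.05, 0.99, 1.15, 1.08, 1.01, 1.18, 0.97]
q1, med, q3 = np.quantile(split_rmses, [0.25, 0.5, 0.75])
print(f"median {med:.3f} (Q1 {q1:.3f}, Q3 {q3:.3f})")
```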
Cite this article
Prado, E.B., Moral, R.A. & Parnell, A.C. Bayesian additive regression trees with model trees. Stat Comput 31, 20 (2021). https://doi.org/10.1007/s11222-021-09997-3