Posterior concentration for Bayesian regression trees and forests, Annals of Statistics

Posterior concentration for Bayesian regression trees and forests
Annals of Statistics (IF 3.2), Pub Date: 2020-08-01, DOI: 10.1214/19-aos1879
Veronika Ročková, Stéphanie van der Pas

Since their inception in the 1980s, regression trees have been one of the more widely used nonparametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet susceptible to overfitting. Strategies against overfitting have traditionally relied on pruning greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting through priors. Roughly speaking, a good prior places most of its mass on smaller trees, where overfitting does not occur. While the consistency of random histograms, trees and their ensembles has been studied quite extensively, the theoretical understanding of their Bayesian counterparts has been missing. In this paper, we take a step towards understanding why and when Bayesian trees and their ensembles do not overfit. To address this question, we study the speed at which the posterior concentrates around the true smooth regression function. We propose a spike-and-tree variant of the popular Bayesian CART prior and establish new theoretical results showing that regression trees (and their ensembles) (a) are capable of recovering smooth regression surfaces, achieving optimal rates up to a log factor, (b) can adapt to the unknown level of smoothness and (c) can perform effective dimension reduction when p > n. These results provide a piece of missing theoretical evidence explaining why Bayesian trees (and additive variants thereof) have worked so well in practice.
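
The "histogram reconstruction" mentioned in the abstract can be illustrated with a minimal sketch: a one-dimensional regression tree that recursively splits the covariate axis and predicts the mean response within each terminal node (bin). This is a greedy, CART-style fit written only for illustration, not the Bayesian spike-and-tree prior studied in the paper; the function names (`fit_tree`, `predict`, `count_leaves`, `mse`) and the sine test function are assumptions introduced here.

```python
import math


def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(points, depth):
    """Greedily grow a 1-D regression tree on (x, y) pairs sorted by x.

    Returns ("leaf", mean) or ("split", threshold, left_subtree, right_subtree).
    """
    ys = [y for _, y in points]
    mean = sum(ys) / len(ys)
    if depth == 0 or len(points) < 2:
        return ("leaf", mean)
    best = None  # (cost, split index)
    for i in range(1, len(points)):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot place a threshold between identical x values
        cost = sse(ys[:i]) + sse(ys[i:])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return ("leaf", mean)
    i = best[1]
    threshold = (points[i - 1][0] + points[i][0]) / 2
    left = fit_tree(points[:i], depth - 1)
    right = fit_tree(points[i:], depth - 1)
    return ("split", threshold, left, right)


def predict(tree, x):
    """Walk down to the terminal node containing x and return its mean."""
    while tree[0] == "split":
        tree = tree[2] if x < tree[1] else tree[3]
    return tree[1]


def count_leaves(tree):
    """Number of terminal nodes, i.e. histogram bins."""
    if tree[0] == "leaf":
        return 1
    return count_leaves(tree[2]) + count_leaves(tree[3])


def mse(tree, points):
    """Mean squared error of the piecewise-constant fit on the given points."""
    return sum((predict(tree, x) - y) ** 2 for x, y in points) / len(points)


# Noise-free samples of a smooth function on [0, 1), used only as a toy target.
data = sorted((i / 64, math.sin(2 * math.pi * i / 64)) for i in range(64))
shallow = fit_tree(data, depth=1)  # two bins
deep = fit_tree(data, depth=4)     # up to sixteen bins
```

On this toy data the deeper tree has more bins and a smaller training error than the shallow one. With noisy responses, letting the tree grow without restraint is what leads to overfitting, which is the behavior that pruning, or a prior favoring small trees, is meant to control.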

Updated: 2020-08-01
