On Cesáro Averages for Weighted Trees in the Random Forest
Journal of Classification (IF 2), Pub Date: 2019-03-30, DOI: 10.1007/s00357-019-09322-8
Hieu Pham, Sigurður Olafsson

The random forest is a popular and effective classification method. It uses a combination of bootstrap resampling and subspace sampling to construct an ensemble of decision trees that are then averaged for a final prediction. In this paper, we propose a potential improvement on the random forest that can be thought of as applying a weight to each tree before averaging. The new method is motivated by the potential instability of averaging predictions of trees that may be of highly variable quality, and because of this, we replace the regular average with a Cesáro average. We provide both a theoretical analysis that gives exact conditions under which the new approach outperforms the traditional random forest, and numerical analysis that shows the new approach is competitive when training a classification model on numerous realistic data sets.
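
To make the idea of replacing a regular average with a Cesáro average concrete, here is a minimal sketch. It uses only the standard definition of a Cesáro average (the mean of the running means of a sequence); the abstract does not say how the trees are ordered or how votes are encoded, so the ordering ("best tree first"), the vote values, and the names `cesaro_weights`, `cesaro_average`, and `tree_votes` are all illustrative assumptions, not the authors' implementation.

```python
def cesaro_weights(n):
    """Implied weight of each term in a Cesáro average of a length-n sequence.

    Term i (1-indexed) appears in the running means k = i..n, each time with
    weight 1/k, and the n running means are then averaged.
    """
    return [sum(1.0 / k for k in range(i, n + 1)) / n for i in range(1, n + 1)]


def cesaro_average(values):
    """Mean of the running means of `values`."""
    running_sum = 0.0
    partial_means = []
    for k, v in enumerate(values, start=1):
        running_sum += v
        partial_means.append(running_sum / k)
    return sum(partial_means) / len(partial_means)


# Hypothetical votes for the positive class from five trees,
# ASSUMED to be ordered from most to least trusted tree.
tree_votes = [1.0, 1.0, 0.0, 1.0, 0.0]

plain = sum(tree_votes) / len(tree_votes)   # regular average: equal weights
cesaro = cesaro_average(tree_votes)         # earlier trees count for more
weights = cesaro_weights(len(tree_votes))   # decreasing, sums to 1
```

Under this definition the weights decrease along the sequence and sum to one, so the Cesáro average acts as a weighted vote in which earlier trees carry more influence. Whether that helps depends on the ordering, which is exactly the kind of condition the paper's theoretical analysis is said to address.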

Updated: 2019-03-30