Regularizing axis-aligned ensembles via data rotations that favor simpler learners
Statistics and Computing (IF 1.6), Pub Date: 2021-01-27, DOI: 10.1007/s11222-020-09973-3
Rico Blaser , Piotr Fryzlewicz

To overcome the inherent limitations of axis-aligned base learners in ensemble learning, several methods of rotating the feature space have been discussed in the literature. In particular, smoother decision boundaries can often be obtained from axis-aligned ensembles by rotating the feature space. In the present paper, we introduce a low-cost regularization technique that favors rotations which produce compact base learners. The restated problem adds a shrinkage term to the loss function that explicitly accounts for the complexity of the base learners. For example, for tree-based ensembles, we apply a penalty based on the median number of nodes and the median depth of the trees in the forest. Rather than jointly minimizing prediction error and model complexity, which is computationally infeasible, we first generate a prioritized weighting of the available feature rotations that promotes lower model complexity, and subsequently minimize prediction error on each of the selected rotations. We show that the resulting ensembles tend to be significantly denser, faster to evaluate, and competitive at generalizing to out-of-sample predictions.
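The two-stage idea described above — score candidate rotations by the complexity of the trees they induce, then prioritize the simplest ones — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rotation sampler, the toy data, and the use of scikit-learn's `DecisionTreeClassifier` are assumptions made for the example, and the complexity score shown (median node count plus median depth over a small bagged forest) merely mirrors the kind of penalty the abstract mentions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def random_rotation(d, rng):
    """Draw a random orthogonal matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    return q * np.sign(np.diag(r))  # fix column signs for a uniform distribution

def median_complexity(X, y, n_trees=25, rng=None):
    """Median node count plus median depth over a small bagged forest."""
    nodes, depths = [], []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), len(X))  # bootstrap sample
        t = DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])
        nodes.append(t.tree_.node_count)
        depths.append(t.get_depth())
    return float(np.median(nodes) + np.median(depths))

# Toy data: labels separated by an oblique boundary, so an axis-aligned
# tree needs many splits unless the features are rotated favorably.
X = rng.standard_normal((200, 2))
y = (X @ np.array([1.0, 1.0]) > 0).astype(int)

# Score candidate rotations (identity included) and rank them so that
# rotations yielding simpler trees come first -- the prioritized weighting.
rotations = [np.eye(2)] + [random_rotation(2, rng) for _ in range(9)]
scores = [median_complexity(X @ R, y, rng=rng) for R in rotations]
order = np.argsort(scores)  # simplest rotations first
```

In a second stage, one would then fit the actual ensemble only on the highest-priority rotations and minimize prediction error there, rather than searching jointly over rotations and error.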




Updated: 2021-01-28