Minimax optimal rates for Mondrian trees and forests
Annals of Statistics (IF 4.5), Pub Date: 2020-08-01, DOI: 10.1214/19-aos1886
Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

Introduced by Breiman (2001), Random Forests are widely used classification and regression algorithms. Although they were initially designed as batch algorithms, several variants have been proposed to handle online learning. One particular instance of such forests is the Mondrian Forest, whose trees are built using the so-called Mondrian process, which makes it easy to update their construction in a streaming fashion. In this paper, we study Mondrian Forests in a batch setting and prove their consistency, assuming a proper tuning of the lifetime sequence. A thorough theoretical study of Mondrian partitions allows us to derive an upper bound for the risk of Mondrian Forests, which turns out to be the minimax optimal rate for both Lipschitz and twice differentiable regression functions. These results are the first to state that some particular random forests achieve minimax rates in arbitrary dimension, paving the way to a refined theoretical analysis and thus a deeper understanding of these black box algorithms.
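
For readers unfamiliar with the Mondrian process mentioned in the abstract, the following is a minimal sketch of how a Mondrian partition of a box can be sampled for a given lifetime. It follows the standard recursive description of the process (exponential split times with rate equal to the sum of the cell's side lengths, split axis chosen proportionally to side length, uniform cut location). The function name `sample_mondrian` and its parameters are hypothetical and are not taken from the paper; this is an illustration, not the authors' implementation.

```python
import random


def sample_mondrian(lower, upper, lifetime, rng, start_time=0.0):
    """Sample the leaf cells of a Mondrian partition of the box [lower, upper].

    Hypothetical helper for illustration only. Returns a list of
    (lower_corner, upper_corner) tuples, one per leaf cell.
    """
    sides = [u - l for l, u in zip(lower, upper)]
    linear_dimension = sum(sides)  # sum of the side lengths of the cell
    if linear_dimension <= 0.0:
        # Degenerate cell: nothing left to split.
        return [(tuple(lower), tuple(upper))]

    # The waiting time before the cell splits is exponential with rate
    # equal to the sum of its side lengths.
    split_time = start_time + rng.expovariate(linear_dimension)
    if split_time > lifetime:
        # The lifetime budget is exhausted: the cell becomes a leaf.
        return [(tuple(lower), tuple(upper))]

    # Split axis drawn with probability proportional to the side length,
    # cut location drawn uniformly along that side.
    axis = rng.choices(range(len(sides)), weights=sides)[0]
    cut = rng.uniform(lower[axis], upper[axis])

    left_upper = list(upper)
    left_upper[axis] = cut
    right_lower = list(lower)
    right_lower[axis] = cut

    # Both children continue independently from the current split time.
    return (sample_mondrian(lower, left_upper, lifetime, rng, split_time)
            + sample_mondrian(right_lower, upper, lifetime, rng, split_time))


if __name__ == "__main__":
    rng = random.Random(0)
    cells = sample_mondrian([0.0, 0.0], [1.0, 1.0], lifetime=3.0, rng=rng)
    print(len(cells), "cells")
```

A larger lifetime yields finer partitions, which is why the abstract stresses tuning the lifetime sequence: it plays the role of a complexity parameter for each tree.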

Updated: 2020-08-01