The Stata Journal: Promoting communications on statistics and Stata (IF 3.2). Pub Date: 2020-03-24. DOI: 10.1177/1536867x20909688. Authors: Matthias Schonlau, Rosie Yuyan Zou
Random forests (Breiman, 2001, Machine Learning 45: 5–32) is a statistical- or machine-learning algorithm for prediction. In this article, we introduce a corresponding new command, rforest. We give an overview of the random forest algorithm and illustrate its use with two examples. The first example is a classification problem that predicts whether a credit card holder will default on his or her debt. The second example is a regression problem that predicts the log-scaled number of shares of online news articles. We conclude with a discussion that summarizes the key points demonstrated in the examples.
Title (translated from the Chinese): The random forest algorithm for statistical learning