Forward stepwise random forest analysis for experimental designs, Journal of Quality Technology

Forward stepwise random forest analysis for experimental designs
Journal of Quality Technology (IF 2.6), Pub Date: 2021-01-28
Chang-Yun Lin

Abstract

Analyses of designed experiments usually assume that the data follow normal distributions and that the models have linear structures. In practice, experimenters may encounter different types of responses and be uncertain about the model structure. In such cases, traditional methods such as ANOVA and regression are not suitable for data analysis and model selection. We introduce random forest analysis, a powerful machine learning method capable of analyzing numerical and categorical data with complicated model structures. To perform model selection and factor identification with the random forest method, we propose a forward stepwise algorithm and develop Python and R code based on minimizing the out-of-bag (OOB) error. Six examples, including simulations and case studies, are provided. We compare the performance of the proposed method with that of some frequently used analysis methods. The results show that forward stepwise random forest analysis, in general, has high power for identifying active factors and selects models with high prediction accuracy.
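
The abstract describes the algorithm only at a high level, so the following is a minimal sketch of one way a forward stepwise search driven by OOB error might look. It is not the author's published code. It assumes scikit-learn's RandomForestRegressor, a numerical response, and a simple stopping rule (stop when no remaining factor lowers the OOB mean squared error). The function names oob_error, forward_stepwise_rf, and make_demo_design, and the simulated two-level design, are hypothetical and exist only for illustration.

```python
import itertools

import numpy as np
from sklearn.ensemble import RandomForestRegressor


def oob_error(X, y, columns, n_trees=200, seed=0):
    """OOB mean squared error of a forest fit on the given factor columns."""
    forest = RandomForestRegressor(
        n_estimators=n_trees, bootstrap=True, oob_score=True, random_state=seed
    )
    forest.fit(X[:, columns], y)
    # oob_prediction_ holds, for each run, the average prediction of the
    # trees whose bootstrap sample did not contain that run.
    return float(np.mean((y - forest.oob_prediction_) ** 2))


def forward_stepwise_rf(X, y, n_trees=200, seed=0):
    """Greedy forward selection (assumed rule): add the factor that lowers
    the OOB error the most, and stop when no factor lowers it."""
    selected = []
    remaining = list(range(X.shape[1]))
    # Baseline with no factors: error of predicting the mean response.
    best_error = float(np.var(y))
    while remaining:
        trial = {j: oob_error(X, y, selected + [j], n_trees, seed) for j in remaining}
        candidate = min(trial, key=trial.get)
        if trial[candidate] >= best_error:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_error = trial[candidate]
    return selected, best_error


def make_demo_design(seed=1):
    """Hypothetical data: a full 2^6 factorial in which only factors 0 and 3
    are active (main effects 3 and -2) plus normal noise."""
    rng = np.random.default_rng(seed)
    X = np.array(list(itertools.product([-1.0, 1.0], repeat=6)))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(0.0, 0.5, size=len(X))
    return X, y


if __name__ == "__main__":
    X, y = make_demo_design()
    factors, error = forward_stepwise_rf(X, y)
    print("selected factors:", factors)
    print("OOB mean squared error:", round(error, 3))
```

A categorical response would call for RandomForestClassifier and an OOB misclassification rate instead of the squared error used here. Note also that a greedy search which adds one factor at a time can miss factors that act only through interactions, since such a factor may not lower the OOB error when added alone; how the paper handles this is not stated in the abstract.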

Updated: 2021-01-28