Interval selection: A case-study-based approach
Applied Stochastic Models in Business and Industry (IF 1.3), Pub Date: 2021-03-27, DOI: 10.1002/asmb.2611
Rosa Arboretti 1 , Riccardo Ceccato 2 , Luca Pegoraro 2 , Luigi Salmaso 2

Variable selection plays a fundamental role in the analysis of data that contain many variables that are redundant or irrelevant to the problem of interest. Identifying and discarding these variables can improve predictive performance and data interpretation while reducing costs and computational time. Although many methods have been proposed for feature selection, in some fields there is more interest in selecting groups of variables, because adjacent data points are continuous in nature and covary. This is the case for near-infrared spectroscopy, where several methods, mainly based on partial least squares regression, have been proposed for interval selection. In this article, we consider some of these methods and propose an additional solution based on a variable clustering procedure (Cov/VSURF), Lasso regression and permutation tests. We compare their performance on four public datasets and discuss the impact of interval selection on the predictive performance of the considered models.
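To make the idea of interval selection concrete, the following is a minimal sketch of permutation-test screening over contiguous spectral windows. It is not the procedure proposed in the article: fixed-width windows stand in for the variable clustering step, an absolute Pearson correlation stands in for the Lasso fit, and the data are synthetic. All function names, the window width, the number of permutations and the significance level are assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def interval_means(spectra, width):
    """Average each spectrum over contiguous, equal-width windows.

    Returns the (start, stop) bounds of each window and, per window,
    the list of window means across samples.
    """
    p = len(spectra[0])
    bounds = [(s, min(s + width, p)) for s in range(0, p, width)]
    feats = [[sum(row[a:b]) / (b - a) for row in spectra] for a, b in bounds]
    return bounds, feats


def permutation_pvalue(feature, y, n_perm, rng):
    """P-value of |corr(feature, y)| under random shuffles of y."""
    observed = abs(pearson(feature, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson(feature, y_perm)) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def select_intervals(spectra, y, width=10, n_perm=200, alpha=0.05, seed=0):
    """Keep the windows whose permutation p-value falls below alpha."""
    rng = random.Random(seed)
    bounds, feats = interval_means(spectra, width)
    pvals = [permutation_pvalue(f, y, n_perm, rng) for f in feats]
    selected = [b for b, pv in zip(bounds, pvals) if pv < alpha]
    return selected, pvals


# Synthetic toy data (an assumption, not one of the article's datasets):
# 60 samples, 60 "wavelengths", response driven by wavelengths 20..29 only.
data_rng = random.Random(42)
n_samples, n_wavelengths = 60, 60
spectra = [[data_rng.gauss(0, 1) for _ in range(n_wavelengths)]
           for _ in range(n_samples)]
y = [sum(row[20:30]) + data_rng.gauss(0, 0.5) for row in spectra]

selected, pvals = select_intervals(spectra, y)
print(selected)
```

In this toy setting the window covering wavelengths 20 to 29 should be retained. The sketch applies no multiple-testing correction, so uninformative windows can occasionally pass the threshold by chance; real spectra are also strongly correlated across adjacent wavelengths, which the independent noise used here does not reproduce.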

Updated: 2021-03-27