Comparison of variable selection methods in partial least squares regression
Journal of Chemometrics (IF 2.4), Pub Date: 2020-02-20, DOI: 10.1002/cem.3226
Tahir Mehmood 1,2, Solve Sæbø 2, Kristian Hovde Liland 2,3

Advances in technology have made it increasingly easy to generate vast numbers of variables from a given sample. Variable selection is essential both for data reduction and for understanding the modeled relationship. Partial least squares (PLS) regression is among the modeling approaches suited to high-throughput data, and a considerable number of variable selection methods have been introduced for it. Most of these methods were reviewed in a recent study. Motivated by that review, we conducted a comparison of the available methods for variable selection within PLS. The main aim of this study was to reveal patterns of dependency between variable selection method and data properties that can guide the choice of method in practical data analysis. To this end, a simulation study was conducted on data sets with diverse properties: the number of variables, the number of samples, the model complexity level, and the information content. The results indicate that these factors, together with the variant of PLS method and their mutual higher-order interactions, all significantly affect the prediction capability of the model and the choice of variable selection strategy.

Updated: 2020-02-20