Relevant and irrelevant predictors in PLS2
Journal of Chemometrics (IF 2.4), Pub Date: 2020-04-07, DOI: 10.1002/cem.3237
Matteo Stocchero

Partial least squares (PLS) regression is widely applied to regression problems in which the data are correlated and redundant. Although many studies on feature selection and variable importance have been published, selecting the subset of relevant features that explain the behaviour of the system under investigation, and identifying the subset of irrelevant predictors that can be ignored, remains an open issue. Here, a new strategy for measuring variable importance is introduced, and a wrapper method is proposed for selecting relevant and irrelevant predictors. The variable importance measure is developed by grouping the predictors into classes of equivalent features through clustering in the latent space and by considering the variations in the goodness of the PLS2 model produced by perturbing the block of predictors. The wrapper method implements stability selection using bootstrap and feature selection. The behaviour of the new variable importance score and its use within the wrapper method are discussed through the investigation of two simulated data sets and one real data set.
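
The general shape of a bootstrap-based wrapper around a PLS2 model can be illustrated with a minimal Python sketch. It is not the algorithm of the paper: the clustering of predictors in the latent space is omitted, and a simple permutation of each predictor column stands in for the perturbation of the predictor block. All function names, the `top_k` rule, the 0.8 frequency threshold, and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def pls2_coefficients(X, Y, n_components):
    """NIPALS-style PLS2 on centered data; returns B with Y_hat = X_centered @ B."""
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of the cross-product X'Y.
        w = np.linalg.svd(Xc.T @ Yc, full_matrices=False)[0][:, 0]
        t = Xc @ w                                   # latent score
        p, q = Xc.T @ t / (t @ t), Yc.T @ t / (t @ t)  # X and Y loadings
        Xc, Yc = Xc - np.outer(t, p), Yc - np.outer(t, q)  # deflation
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    return W @ np.linalg.solve(P.T @ W, Q.T)

def r2(X, Y, B, x_mean, y_mean):
    """Overall R2 across all response columns."""
    resid = (Y - y_mean) - (X - x_mean) @ B
    return 1.0 - (resid ** 2).sum() / ((Y - y_mean) ** 2).sum()

def perturbation_importance(X, Y, n_components, rng):
    """Drop in R2 when one predictor column is permuted (assumed stand-in score)."""
    B = pls2_coefficients(X, Y, n_components)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    base = r2(X, Y, B, x_mean, y_mean)
    drops = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(X[:, j])
        drops[j] = base - r2(Xp, Y, B, x_mean, y_mean)
    return drops

def stability_selection(X, Y, n_components=2, n_boot=50, top_k=2, seed=0):
    """Fraction of bootstrap resamples in which each predictor ranks in the top_k."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)             # bootstrap resample of rows
        imp = perturbation_importance(X[idx], Y[idx], n_components, rng)
        counts[np.argsort(imp)[-top_k:]] += 1
    return counts / n_boot

# Simulated example (assumed): 6 predictors, only the first two drive 2 responses.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(100, 6))
C = np.array([[1.0, 0.3], [0.4, 0.8]])
Y = X[:, :2] @ C + 0.1 * data_rng.normal(size=(100, 2))

freq = stability_selection(X, Y)
relevant = np.flatnonzero(freq >= 0.8)   # predictors selected in most resamples
print(freq, relevant)
```

In this sketch, a predictor is labelled relevant when it is selected in at least 80% of the bootstrap resamples; the remaining predictors are treated as candidates for the irrelevant set. Both the selection rule and the threshold would need tuning on real data.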

Updated: 2020-04-07