Adaptive Bottom-Up Space Exploration in model population analysis: An agile variable selection algorithm for PLS models, Chemometrics and Intelligent Laboratory Systems


Chemometrics and Intelligent Laboratory Systems (IF 3.7), Pub Date: 2020-08-01, DOI: 10.1016/j.chemolab.2020.104057
Biswanath Mahanty

Abstract: All variable selection algorithms for partial least squares (PLS) regression models that are based on model population analysis, including the variable iterative space shrinkage approach (VISSA), take an iterative top-down space shrinkage approach. This unidirectional strategy is time-inefficient: as the variable space shrinks step by step, much of the run time is spent evaluating sub-models of irrelevant size. In this work, two variants of an Adaptive Bottom-Up Space Exploration (ABUSE) approach are proposed. Both variants, built on the VISSA framework, adopt a low variable selection weight (e.g. 0.005) to formulate populations of short sub-models. When the average fitness of the sub-model population stops improving under reweighted sampling, the weight of the variables appearing in the "best-fit" sub-model is increased to 0.5, while the other variables are kept at the same low selection frequency (rather than expunged, as in VISSA) for the next round of iteration. The first variant enforces this weight vector manipulation once in a binary matrix sampling, while the second enforces it in every cycle of weighted binary matrix sampling. The proposed methods offered better fitness, outcome stability and algorithmic efficiency, particularly for large benchmark NIR data sets. The choice of variable weight is critical: at higher weights the algorithm, though still better than VISSA, tends to select more variables, with a deterioration of model fitness.
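The weight manipulation described in the abstract can be illustrated with a short sketch. This is not the author's implementation: the function names, the toy error function, and the use of independent Bernoulli draws in place of the VISSA weighted binary matrix sampling are all assumptions made for illustration. Resetting every variable outside the best-fit sub-model to the low weight is likewise one reading of the abstract. Only the weights 0.005 and 0.5 come from the abstract itself.

```python
import random

LOW_WEIGHT = 0.005   # low selection weight quoted in the abstract
HIGH_WEIGHT = 0.5    # weight assigned to variables of the best-fit sub-model


def sample_submodels(weights, n_models, rng):
    """Draw n_models sub-models; variable j is included with probability weights[j].

    Assumption: independent Bernoulli draws stand in for the weighted binary
    matrix sampling of the VISSA framework.
    """
    population = []
    for _ in range(n_models):
        chosen = [j for j, w in enumerate(weights) if rng.random() < w]
        if not chosen:
            # Assumption: an empty draw is replaced by one random variable.
            chosen = [rng.randrange(len(weights))]
        population.append(chosen)
    return population


def promote_best(weights, best_submodel):
    """Set best-fit variables to HIGH_WEIGHT and hold the rest at LOW_WEIGHT.

    No variable receives weight 0, so none is expunged from later sampling.
    """
    promoted = set(best_submodel)
    return [HIGH_WEIGHT if j in promoted else LOW_WEIGHT
            for j in range(len(weights))]


def toy_error(submodel):
    """Hypothetical stand-in for a cross-validated PLS error (lower is better)."""
    informative = {2, 7}
    missed = len(informative - set(submodel))
    return missed + 0.01 * len(submodel)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [LOW_WEIGHT] * 200
    population = sample_submodels(weights, n_models=500, rng=rng)
    best = min(population, key=toy_error)
    weights = promote_best(weights, best)
    print("best sub-model:", best)
    print("promoted variables:",
          [j for j, w in enumerate(weights) if w == HIGH_WEIGHT])
```

With 200 variables and a weight of 0.005, a sampled sub-model contains one variable on average, which is what makes the initial population "short". In a real run, `toy_error` would be replaced by the fitness of a PLS model fitted on the selected variables.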


Updated: 2020-08-01
