Model selection challenges with application to multivariate calibration updating methods,Journal of Chemometrics



Journal of Chemometrics (IF 1.9), Pub Date: 2020-04-27, DOI: 10.1002/cem.3245
Anit Gurung, John H. Kalivas

An important issue in multivariate calibration, including model updating methods with multiple tuning parameters, is the selection of final models. Model updating is an adaptation process in which a model built to predict an analyte under primary sample and measurement conditions is updated to predict that analyte under new secondary conditions. Finding a single process that selects models (tuning parameter values) with satisfactory bias-variance trade-offs across multiple data sets and modeling methods is challenging. This paper evaluates how consistently a collection of model quality measures selects models across five near-infrared (NIR) data sets for three calibration updating approaches. The goal is to formulate a reliable model selection process that is nearly independent of the data and of the model updating method. Two of the three model updating approaches require primary and secondary analyte reference values; the third needs only primary reference values (that is, the secondary samples are unlabeled). However, all model selection methods considered do require secondary samples with reference values. The results show that the choice of model quality measure depends on the degree of spectral similarity between primary and secondary spectra, as characterized by the indicator of spectral uniqueness measure developed in this paper. The results also indicate that more work is needed to better characterize how model selection depends on the model quality measures, the number of samples and their inherent compositions (data set-dependent matrix effects), and the tuning parameter ranges.
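To make the selection problem in the abstract concrete, the following is a minimal sketch and not the authors' procedure. It assumes a ridge-type (Tikhonov) updating model with two hypothetical tuning parameters, `lam` (weight on a few labeled secondary samples) and `eta` (penalty on the model vector size), and it uses two illustrative model quality measures: RMSE on held-out secondary samples (a bias-type indicator) and the 2-norm of the regression vector (a variance-type indicator). All function names, the grid values, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fit_updated_model(X_prim, y_prim, X_sec, y_sec, lam, eta):
    """Ridge-type fit on primary data augmented with weighted secondary samples.

    Minimizes ||X_prim b - y_prim||^2 + lam^2 ||X_sec b - y_sec||^2 + eta^2 ||b||^2
    by solving one stacked least-squares problem (hypothetical formulation).
    """
    n_vars = X_prim.shape[1]
    X_aug = np.vstack([X_prim, lam * X_sec, eta * np.eye(n_vars)])
    y_aug = np.concatenate([y_prim, lam * y_sec, np.zeros(n_vars)])
    b, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return b

def rmse(X, y, b):
    """Root mean square error of predictions X @ b against reference values y."""
    return float(np.sqrt(np.mean((X @ b - y) ** 2)))

def select_model(X_prim, y_prim, X_sec, y_sec, X_val, y_val, lams, etas):
    """Grid search over (lam, eta).

    Returns (rmse_val, norm_b, lam, eta) for the pair with the lowest RMSE
    on the labeled secondary validation samples.
    """
    best = None
    for lam in lams:
        for eta in etas:
            b = fit_updated_model(X_prim, y_prim, X_sec, y_sec, lam, eta)
            cand = (rmse(X_val, y_val, b), float(np.linalg.norm(b)), lam, eta)
            if best is None or cand[0] < best[0]:
                best = cand
    return best

# Synthetic stand-in for primary and secondary spectra (not real NIR data).
rng = np.random.default_rng(0)
n_vars = 20
b_true = rng.normal(size=n_vars)
X_prim = rng.normal(size=(40, n_vars))
y_prim = X_prim @ b_true + 0.05 * rng.normal(size=40)

# Secondary conditions: same underlying relationship plus a constant spectral offset.
shift = 0.5
X_sec_all = rng.normal(size=(12, n_vars)) + shift
y_sec_all = (X_sec_all - shift) @ b_true + 0.05 * rng.normal(size=12)

# A few labeled secondary samples update the model; the rest are used for selection.
X_sec, y_sec = X_sec_all[:4], y_sec_all[:4]
X_val, y_val = X_sec_all[4:], y_sec_all[4:]

lams = [0.1, 1.0, 10.0]
etas = [0.01, 0.1, 1.0, 10.0]
best = select_model(X_prim, y_prim, X_sec, y_sec, X_val, y_val, lams, etas)
print("RMSE_val=%.4f  ||b||=%.4f  lam=%g  eta=%g" % best)
```

In this sketch the selection step needs secondary samples with reference values, which mirrors the abstract's remark that all model selection methods considered require them. Picking the minimum validation RMSE alone ignores the variance side of the trade-off; a fuller treatment would weigh it against a measure such as the model vector norm.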


Updated: 2020-04-27
