Sample selection, calibration and validation of models developed from a large dataset of near infrared spectra of tree leaves
Journal of Near Infrared Spectroscopy ( IF 1.8 ) Pub Date : 2020-03-13 , DOI: 10.1177/0967033520902536
Jessie Au 1, 2 , Kara N Youngentob 2 , William J Foley 2 , Ben D Moore 3 , Tom Fearn 1

Near infrared spectroscopy is widely used to rapidly and cost-effectively collect chemical information from plant samples. Large datasets with hundreds to thousands of spectra and reference values are becoming increasingly common as researchers accumulate data over many years or across research groups. These datasets potentially contain great spectral and chemical variation and could produce a broadly applicable calibration model. In this study, partial least squares regression was used to model relationships between near infrared spectra and the foliar concentration of two ecologically important chemical traits, available nitrogen and total formylated phloroglucinol compounds, in Eucalyptus leaves. The nested spatial structure within the extensive dataset of spectra and reference values from 80 species of Eucalyptus was taken into account during calibration development and model validation. Geographic variation amongst samples influenced how well available nitrogen could be predicted. Predictive error of the model was greatest when it was tested against samples from Australian states and local government areas different from those of the calibration set. In addition, the results showed that simply relying on spectral variation (assessed by Mahalanobis distance) may mislead researchers about how many reference values are needed. The prediction accuracy of the model of available nitrogen differed little whether 300 or up to 987 calibration samples were included, which indicated that an excessive number of reference values had been obtained. Lastly, a suitable multi-species calibration for formylated phloroglucinol compounds was produced, and the difficulties associated with predicting complex chemical traits were discussed. Directing effort towards broadly applicable models will encourage sharing of calibration models across projects and research groups and facilitate the integration of near infrared spectroscopy in many research fields.
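The validation idea in the abstract, testing a calibration against samples from regions that were not in the calibration set, can be sketched as leave-one-group-out validation scored by root mean square error of prediction (RMSEP). The following is a minimal illustration under stated assumptions, not the authors' procedure: the toy data are made up, the function names (`rmsep`, `fit_line`, `leave_one_group_out`) are hypothetical, and a one-variable least-squares line stands in for the partial least squares model so that the example needs only the standard library.

```python
import math
from collections import defaultdict


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_line(x, y):
    """Ordinary least-squares line; a stand-in for the PLS model (assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def leave_one_group_out(x, y, groups):
    """Hold out each group in turn, calibrate on the rest, return RMSEP per group."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    errors = {}
    for g, test_idx in by_group.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        preds = [slope * x[i] + intercept for i in test_idx]
        errors[g] = rmsep([y[i] for i in test_idx], preds)
    return errors


# Made-up toy data: one spectral feature, one reference value, one region label.
x = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.2]
groups = ["A", "A", "B", "B", "C", "C"]
errors = leave_one_group_out(x, y, groups)
```

Here `errors` maps each held-out region label to the RMSEP obtained when that region is excluded from calibration. Grouping the split by region, rather than splitting samples at random, keeps spatially related samples from appearing on both sides of the split.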

Updated: 2020-03-13