Visualization of a Machine Learning Framework toward Highly Sensitive Qualitative Analysis by SERS
Analytical Chemistry (IF 6.7), Pub Date: 2022-07-06, DOI: 10.1021/acs.analchem.2c01450
Si-Heng Luo 1, 2 , Wei-Li Wang 2 , Zhi-Fan Zhou 2 , Yi Xie 3, 4 , Bin Ren 1 , Guo-Kun Liu 2 , Zhong-Qun Tian 1

Surface-enhanced Raman spectroscopy (SERS), providing near-single-molecule-level fingerprint information, is a powerful tool for the trace analysis of a target in a complicated matrix and is especially facilitated by the development of modern machine learning algorithms. However, both the high demand for mass data and the low interpretability of the black-box operation significantly limit the application of well-trained models to real systems. Aiming at these two issues, we constructed a novel machine learning algorithm-based framework (Vis-CAD), integrating visual random forest, characteristic amplifier, and data augmentation. The introduction of data augmentation significantly reduced the requirement for mass data, and the visualization of the random forest clearly presented the captured features, by which one was able to determine the reliability of the algorithm. Taking the trace analysis of individual polycyclic aromatic hydrocarbons in a mixture as an example, a trustworthy accuracy of no less than 99% was realized under the optimized condition. The visualization of the algorithm framework distinctly demonstrated that the captured features were well correlated with the characteristic Raman peaks of each individual compound. Furthermore, the sensitivity toward a trace individual compound could be improved by at least 1 order of magnitude as compared to that with the naked eye. The proposed algorithm, distinguished by its lesser demand for mass data and the visualization of the operation process, offers a new way for the indestructible application of machine learning algorithms, which would bring push-to-the-limit sensitivity toward the qualitative and quantitative analysis of trace targets, not only in the field of SERS, but also in the much wider spectroscopy world. It is implemented in the Python programming language and is open-source at https://github.com/3331822w/Vis-CAD.
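
The data augmentation idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the Vis-CAD code: the abstract does not say which augmentations the framework applies, so the small axis shift, intensity scaling, and additive Gaussian noise used here are assumptions taken from common spectral practice. The function names (`toy_spectrum`, `augment`, `augment_dataset`), the parameter values, and the synthetic two-peak spectra are all hypothetical.

```python
import math
import random


def toy_spectrum(peak_centers, n_points=200, width=3.0):
    """Toy spectrum: sum of unit-height Gaussian peaks (illustrative data only)."""
    return [
        sum(math.exp(-((i - c) ** 2) / (2 * width ** 2)) for c in peak_centers)
        for i in range(n_points)
    ]


def augment(spectrum, rng, max_shift=2, noise_sd=0.02, scale_range=(0.8, 1.2)):
    """Return one perturbed copy: small axis shift, intensity scaling, additive noise.

    These three perturbations are assumed for illustration; they are not
    taken from the Vis-CAD source.
    """
    n = len(spectrum)
    shift = rng.randint(-max_shift, max_shift)
    scale = rng.uniform(*scale_range)
    out = []
    for i in range(n):
        j = min(max(i - shift, 0), n - 1)  # clamp at the edges instead of wrapping
        out.append(scale * spectrum[j] + rng.gauss(0.0, noise_sd))
    return out


def augment_dataset(spectra, labels, copies, seed=0):
    """Expand a small labeled set: keep each original and add `copies` variants."""
    rng = random.Random(seed)
    aug_x, aug_y = [], []
    for spectrum, label in zip(spectra, labels):
        aug_x.append(list(spectrum))
        aug_y.append(label)
        for _ in range(copies):
            aug_x.append(augment(spectrum, rng))
            aug_y.append(label)
    return aug_x, aug_y


# Two hypothetical compounds, "A" and "B", with different peak positions.
base = [toy_spectrum([50, 120]), toy_spectrum([80, 150])]
X, y = augment_dataset(base, ["A", "B"], copies=10)
print(len(X), len(y))
```

Each measured spectrum yields several label-preserving variants, so a classifier such as a random forest sees more training examples than were actually recorded. The perturbation sizes would need to be chosen so that characteristic peaks stay recognizable.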

Updated: 2022-07-06