Migrating from partial least squares discriminant analysis to artificial neural networks: a comparison of functionally equivalent visualisation and feature contribution tools using jupyter notebooks.
Metabolomics ( IF 3.6 ) Pub Date : 2020-01-21 , DOI: 10.1007/s11306-020-1640-0
Kevin M Mendez 1 , David I Broadhurst 1 , Stacey N Reinke 1

INTRODUCTION Metabolomics data is commonly modelled multivariately using partial least squares discriminant analysis (PLS-DA). Its success is primarily due to ease of interpretation, through projection to latent structures, and transparent assessment of feature importance using regression coefficients and Variable Importance in Projection scores. In recent years several non-linear machine learning (ML) methods have grown in popularity but with limited uptake essentially due to convoluted optimisation and interpretation. Artificial neural networks (ANNs) are a non-linear projection-based ML method that share a structural equivalence with PLS, and as such should be amenable to equivalent optimisation and interpretation methods.

OBJECTIVES We hypothesise that standardised optimisation, visualisation, evaluation and statistical inference techniques commonly used by metabolomics researchers for PLS-DA can be migrated to a non-linear, single hidden layer, ANN.

METHODS We compared a standardised optimisation, visualisation, evaluation and statistical inference techniques workflow for PLS with the proposed ANN workflow. Both workflows were implemented in the Python programming language. All code and results have been made publicly available as Jupyter notebooks on GitHub.

RESULTS The migration of the PLS workflow to a non-linear, single hidden layer, ANN was successful. There was a similarity in significant metabolites determined using PLS model coefficients and ANN Connection Weight Approach.

CONCLUSION We have shown that it is possible to migrate the standardised PLS-DA workflow to simple non-linear ANNs. This result opens the door for more widespread use and to the investigation of transparent interpretation of more complex ANN architectures.
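The abstract names the ANN Connection Weight Approach as the counterpart to PLS model coefficients but does not spell it out. As a rough illustration only, and not the authors' notebook code, the sketch below uses the commonly cited definition: for a single-hidden-layer network with one output node, each input feature's score is the sum, over hidden nodes, of its input-to-hidden weight multiplied by that node's hidden-to-output weight. The function name, the weight values, and the three-feature, two-node layout are all hypothetical.

```python
import numpy as np


def connection_weight_importance(w_input_hidden, w_hidden_output):
    """Connection Weight score per input feature (illustrative sketch).

    w_input_hidden:  array of shape (n_features, n_hidden)
    w_hidden_output: array of shape (n_hidden,), single output node
    Returns an array of shape (n_features,) where entry i is
    sum_j w_input_hidden[i, j] * w_hidden_output[j].
    """
    w_input_hidden = np.asarray(w_input_hidden, dtype=float)
    w_hidden_output = np.asarray(w_hidden_output, dtype=float)
    return w_input_hidden @ w_hidden_output


# Hypothetical weights for 3 features (e.g. metabolites) and 2 hidden nodes.
W = np.array([[0.5, -1.0],
              [0.0,  2.0],
              [1.5,  0.5]])
v = np.array([2.0, -1.0])

cw = connection_weight_importance(W, v)
print(cw)  # [ 2.  -2.   2.5]

# With a purely linear hidden layer and no biases (an assumption made only
# for this check), the network collapses to a linear model whose
# coefficients are exactly the connection weight scores.
x = np.array([1.0, 2.0, 3.0])
linear_network_output = (x @ W) @ v
print(np.isclose(linear_network_output, x @ cw))  # True
```

The final check hints at why the scores can be read alongside regression coefficients: in the linear case they coincide with the coefficients of the collapsed model, whereas with a non-linear activation they are only a summary of the weights.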

Updated: 2020-01-22