MSpectraAI: a powerful platform for deciphering proteome profiling of multi-tumor mass spectrometry data by using deep neural networks
BMC Bioinformatics (IF 3), Pub Date: 2020-10-07, DOI: 10.1186/s12859-020-03783-0
Shisheng Wang, Hongwen Zhu, Hu Zhou, Jingqiu Cheng, Hao Yang

Mass spectrometry (MS) has become a promising analytical technique for acquiring proteomics information to characterize biological samples. Nevertheless, most studies focus on the final proteins identified by a suite of algorithms that compare partial MS spectra against a sequence database, while pattern recognition and classification of raw mass-spectrometric data remain unresolved. We developed an open-source, comprehensive platform named MSpectraAI for analyzing large-scale MS data with deep neural networks (DNNs); the system covers spectral-feature swath extraction, classification, and visualization. The platform also lets users build their own DNN models with Keras. To evaluate the tool, we collected publicly available proteomics datasets for six tumor types (7,997,805 mass spectra in total) from the ProteomeXchange consortium and classified the samples based on their spectral profiles. The results suggest that MSpectraAI can distinguish different sample types based on the fingerprint spectrum and achieves better prediction accuracy at the MS1 level (average 0.967). This study deciphers proteome profiling of raw mass spectrometry data and broadens the application of deep learning methods to the classification and prediction of proteomics data from multi-tumor samples. MSpectraAI also performs better than classical machine learning approaches.

Updated: 2020-10-07