Deep learning the collisional cross sections of the peptide universe from a million training samples
bioRxiv - Systems Biology Pub Date : 2020-05-21 , DOI: 10.1101/2020.05.19.102285
Florian Meier , Niklas D. Köhler , Andreas-David Brunner , Jean-Marc H. Wanka , Eugenia Voytik , Maximilian T. Strauss , Fabian J. Theis , Matthias Mann

The size and shape of peptide ions in the gas phase are an under-explored dimension for mass spectrometry-based proteomics. To explore the nature and utility of the entire peptide collisional cross section (CCS) space, we measure more than a million data points from whole-proteome digests of five organisms with trapped ion mobility spectrometry (TIMS) and parallel accumulation-serial fragmentation (PASEF). The scale and precision (CV < 1%) of our data are sufficient to train a deep recurrent neural network that accurately predicts CCS values solely based on the peptide sequence. Cross section predictions for the synthetic ProteomeTools library validate the model within a 1.3% median relative error (R > 0.99). Hydrophobicity and the positions of prolines and histidines are the main determinants of the cross sections, in addition to sequence-specific interactions. CCS values can now be predicted for any peptide and organism, forming a basis for advanced proteomics workflows that make full use of the additional information.
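
The abstract reports three summary statistics: a coefficient of variation (CV) for measurement precision, a median relative error for prediction accuracy, and a correlation coefficient R. The sketch below shows one common way such quantities could be computed. It assumes that R is the Pearson correlation and that CV is the sample standard deviation divided by the mean; the abstract does not spell out either definition. The function names and all numeric values are hypothetical toy inputs, not data from the paper.

```python
from math import sqrt
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, in percent (assumed definition)."""
    return stdev(values) / mean(values) * 100.0


def median_relative_error(measured, predicted):
    """Median of |predicted - measured| / measured, in percent."""
    errors = [abs(p - m) / m * 100.0 for m, p in zip(measured, predicted)]
    return median(errors)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy values only (hypothetical, in square angstroms); not taken from the paper.
repeat_measurements = [498.0, 500.0, 502.0]  # one peptide ion measured three times
measured_ccs = [400.0, 500.0, 600.0]         # experimental CCS of three peptides
predicted_ccs = [404.0, 495.0, 612.0]        # model output for the same peptides

cv_percent = coefficient_of_variation(repeat_measurements)
mre_percent = median_relative_error(measured_ccs, predicted_ccs)
r_value = pearson_r(measured_ccs, predicted_ccs)

print(f"CV = {cv_percent:.2f}%")
print(f"median relative error = {mre_percent:.2f}%")
print(f"Pearson R = {r_value:.4f}")
```

The median is less sensitive than the mean to a small number of badly predicted peptides, which may be one reason to report a median relative error for a data set of this size.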

Updated: 2020-05-21