QSSR Modeling of Bacillus Subtilis Lipase A Peptide Collision Cross-Sections in Ion Mobility Spectrometry: Local Descriptor Versus Global Descriptor
The Protein Journal (IF 1.9), Pub Date: 2021-01-16, DOI: 10.1007/s10930-020-09960-7
Zhong Ni, Anlin Wang, Lingyu Kang, Tiancheng Zhang

To investigate structure-dependent peptide mobility behavior in ion mobility spectrometry (IMS), quantitative structure-spectrum relationship (QSSR) models are systematically built to predict the collision cross-section (Ω) values of a total of 162 singly protonated tripeptide fragments extracted from Bacillus subtilis lipase A. Two types of structure characterization, namely local and global descriptors, and three machine learning methods, namely partial least squares (PLS), support vector machine (SVM) and Gaussian process (GP), are employed to parameterize the structures of these peptide samples and correlate them with their Ω values. The local descriptor is derived from a principal component analysis (PCA) of 516 physicochemical properties of the 20 standard amino acids, and is used to characterize, in sequence order, the three amino acid residues that make up a tripeptide. The global descriptor is calculated using the CODESSA method, which can generate more than 200 statistically significant variables that characterize the whole molecular structure of a tripeptide. The resulting QSSR models are evaluated rigorously via tenfold cross-validation and Monte Carlo cross-validation (MCCV). The statistics obtained from each combination of descriptor type and machine learning method are compared comprehensively. The comparison reveals that QSSR models based on the local descriptor have better fitting ability and predictive power, but worse interpretability, than those based on the global descriptor. In addition, because QSSR modeling with the local descriptor does not consider the three-dimensional conformation of the tripeptide samples, it is considerably more efficient than modeling with the global descriptor.
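The two validation schemes named in the abstract differ only in how the 162 samples are partitioned. The following is a minimal sketch of both partitioning schemes using the Python standard library, not the authors' implementation. The function names, the number of MCCV rounds (100), the test fraction (0.2) and the random seed are assumptions made for illustration; the abstract does not state them.

```python
import random

def kfold_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, test

def mccv_splits(n_samples, n_rounds=100, test_fraction=0.2, seed=0):
    """Yield (train, test) index lists from repeated random partitions.

    Unlike k-fold, a sample may land in the test set in several rounds
    or in none. n_rounds and test_fraction are assumed values.
    """
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_rounds):
        test = set(rng.sample(range(n_samples), n_test))
        train = [i for i in range(n_samples) if i not in test]
        yield train, sorted(test)

N_TRIPEPTIDES = 162  # sample count stated in the abstract

tenfold = list(kfold_splits(N_TRIPEPTIDES))
mccv = list(mccv_splits(N_TRIPEPTIDES))
```

In either scheme, a model (PLS, SVM or GP) would be fitted on the `train` indices and scored on the `test` indices of each split, and the scores would then be averaged over splits.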




Updated: 2021-01-18