Multiclassification Prediction of Enzymatic Reactions for Oxidoreductases and Hydrolases Using Reaction Fingerprints and Machine Learning Methods
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2018-05-07, DOI: 10.1021/acs.jcim.7b00656
Yingchun Cai, Hongbin Yang, Weihua Li, Guixia Liu, Philip W. Lee, Yun Tang

Drug metabolism is a complex process in the human body, involving a series of enzyme-catalyzed reactions. Because investigating drug metabolism experimentally is costly and time-consuming, computational methods have been developed to predict it and have shown great advantages. As a first step, classifying metabolic reactions and enzymes is highly desirable for drug metabolism prediction. In this study, we developed multiclassification models to predict the types of reactions catalyzed by oxidoreductases and hydrolases, using three reaction fingerprints to describe the reactions and seven machine learning algorithms to build the models. The models were built on data retrieved from KEGG, comprising 1055 hydrolysis and 2510 redox reactions. The external validation data consisted of 213 hydrolysis and 512 redox reactions extracted from the Rhea database. The best models were built by neural network or logistic regression with a 2048-bit transformation reaction fingerprint. The predictive accuracies of the main class, subclass, and superclass classification models on the external validation sets were all above 90%. This study should be helpful for enzymatic reaction annotation and for further work on metabolism prediction.
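
To make the idea of a fixed-length reaction fingerprint more concrete, the following is a minimal, standard-library-only sketch of a difference-style fingerprint: hashed feature counts of the products minus those of the reactants, folded into 2048 positions. It is an illustration under loud assumptions and not the authors' implementation. The character n-gram features, the CRC32 hashing, and the function names `feature_counts`, `difference_fingerprint`, and `accuracy` are all hypothetical stand-ins; a real workflow would use chemically meaningful features from a cheminformatics toolkit.

```python
import zlib

N_BITS = 2048  # fingerprint length mentioned in the abstract


def feature_counts(smiles, n=3, n_bits=N_BITS):
    """Hash overlapping character n-grams of a SMILES string into n_bits bins.

    ASSUMPTION: character n-grams are a toy stand-in for real structural
    features (such as atom pairs); they are not what the paper uses.
    """
    counts = [0] * n_bits
    # Strings shorter than n still contribute one (short) n-gram.
    for i in range(max(len(smiles) - n + 1, 1)):
        gram = smiles[i:i + n]
        counts[zlib.crc32(gram.encode("utf-8")) % n_bits] += 1
    return counts


def difference_fingerprint(reactants, products, n_bits=N_BITS):
    """Summed product feature counts minus summed reactant feature counts."""
    fp = [0] * n_bits
    for smi in products:
        for i, c in enumerate(feature_counts(smi, n_bits=n_bits)):
            fp[i] += c
    for smi in reactants:
        for i, c in enumerate(feature_counts(smi, n_bits=n_bits)):
            fp[i] -= c
    return fp


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative reaction: ethyl acetate + water -> acetic acid + ethanol.
fp = difference_fingerprint(["CC(=O)OCC", "O"], ["CC(=O)O", "CCO"])
```

Features shared by both sides of the reaction cancel in the subtraction, so the vector mainly reflects what changed. Vectors like `fp`, paired with reaction class labels, could then be passed to any multiclass learner, and `accuracy` mirrors the metric reported for the external validation sets.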

Updated: 2018-05-07