CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes
Bioorganic & Medicinal Chemistry (IF 3.5), Pub Date: 2021-08-28, DOI: 10.1016/j.bmc.2021.116388
Wojciech Plonka, Conrad Stork, Martin Šícho, Johannes Kirchmair

The vast majority of approved drugs are metabolized by the five major cytochrome P450 (CYP) isozymes, 1A2, 2C9, 2C19, 2D6 and 3A4. Inhibition of CYP isozymes can cause drug-drug interactions with severe pharmacological and toxicological consequences. Computational methods for the fast and reliable prediction of the inhibition of CYP isozymes by small molecules are therefore of high interest and relevance to pharmaceutical companies and a host of other industries, including the cosmetics and agrochemical industries. Today, a large number of machine learning models for predicting the inhibition of the major CYP isozymes by small molecules are available. With this work we aim to go beyond the coverage of existing models, by combining data from several major public and proprietary sources. More specifically, we used up to 18815 compounds with measured bioactivities to train random forest classification models for the individual CYP isozymes. A major advantage of the new data collection over existing ones is the better representation of the minority class, the CYP inhibitors. With the new data collection we achieved inhibitor-to-non-inhibitor ratios in the order of 1:1 (CYP1A2) to 1:3 (CYP2D6). We show that our models reach competitive performance on external data, with Matthews correlation coefficients (MCCs) ranging from 0.62 (CYP2C19) to 0.70 (CYP2D6), and areas under the receiver operating characteristic curve (AUCs) between 0.89 (CYP2C19) and 0.92 (CYPs 2D6 and 3A4). Importantly, the models show a high level of robustness, reflected in a good predictivity also for compounds that are structurally dissimilar to the compounds represented in the training data. The best models presented in this work are freely accessible for academic research via a web service.
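The two metrics quoted in the abstract, the Matthews correlation coefficient (MCC) and the area under the ROC curve (AUC), can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores below are all hypothetical and chosen only to show how each number is computed for a binary inhibitor/non-inhibitor classifier.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels (1 = inhibitor, 0 = non-inhibitor)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen inhibitor is scored
    higher than a randomly chosen non-inhibitor (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 inhibitors and 5 non-inhibitors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]  # predicted inhibitor probability
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

print(f"MCC = {matthews_corrcoef(y_true, y_pred):.3f}")
print(f"AUC = {roc_auc(y_true, scores):.3f}")
```

MCC depends on the chosen threshold and uses all four cells of the confusion matrix, which makes it informative for imbalanced class ratios such as the 1:3 ratio reported for CYP2D6. AUC is threshold-free and measures only how well the scores rank inhibitors above non-inhibitors.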




Updated: 2021-08-29