TMLRpred: A machine learning classification model to distinguish reversible EGFR double mutant inhibitors



Chemical Biology & Drug Design (IF 3.2), Pub Date: 2020-10-15, DOI: 10.1111/cbdd.13697
Ravi Saini, Shehnaz Fatima, Subhash Mohan Agarwal

EGFR is a clinically important therapeutic target in lung cancer. The first‐generation tyrosine kinase inhibitors used in the clinic are effective against L858R‐mutated EGFR. However, relapse of the disease due to the resistance mutation T790M renders these inhibitors ineffective, which makes it necessary to identify new, potent EGFR inhibitors against the resistant T790M/L858R (TMLR) double mutant. Therefore, four machine learning techniques (instance‐based learner (IBK), naïve Bayesian (NB), sequential minimal optimization (SMO), and random forest (RF)) were employed to develop twelve classification models, one per technique on each of three datasets (highly, moderately, and weakly active inhibitors). The models were validated using fivefold cross‐validation and independent validation datasets. The random forest‐based models performed best. In addition, the functional groups, PubChem fingerprints, and substructures of highly active inhibitors were compared with those of inactive compounds to identify structural features that are important for activity. To promote open‐source drug discovery, a tool has been developed that incorporates the best‐performing models and allows users to predict the potential of chemical molecules as anti‐TMLR inhibitors. The machine learning classification models developed in this study are expected to pave the way for identifying novel inhibitors against the resistant EGFR double mutants.
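
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind an instance-based (nearest-neighbour) classifier evaluated with fivefold cross-validation. It is not the authors' pipeline. The fingerprints are synthetic sets of on-bit indices generated for illustration, and the Tanimoto similarity, the choice of a single nearest neighbour, and all function names are assumptions.

```python
import random

def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def predict_1nn(train, query):
    """Return the label of the training fingerprint most similar to the query."""
    return max(train, key=lambda item: tanimoto(item[0], query))[1]

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(data, k=5, seed=0):
    """Mean accuracy over k folds, with each fold held out exactly once."""
    folds = kfold_indices(len(data), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [data[i] for i in range(len(data)) if i not in held]
        correct = sum(predict_1nn(train, data[i][0]) == data[i][1] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

def make_toy_data(n_per_class=20, seed=1):
    """Synthetic data (hypothetical): 'active' favours bits 0-9, 'inactive' bits 10-19."""
    rng = random.Random(seed)
    data = []
    for label, lo in (("active", 0), ("inactive", 10)):
        for _ in range(n_per_class):
            bits = set(rng.sample(range(lo, lo + 10), 6))
            bits.add(rng.randrange(20))  # one noisy bit drawn from the full range
            data.append((frozenset(bits), label))
    return data

if __name__ == "__main__":
    data = make_toy_data()
    print(f"5-fold CV accuracy (toy data): {cross_validate(data):.2f}")
```

In a real setting the toy generator would be replaced by fingerprints computed from actual inhibitor structures, and the held-out folds would be complemented by an independent validation set, as the abstract describes.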


Updated: 2020-10-16
