Task-Specific Scoring Functions for Predicting Ligand Binding Poses and Affinity and for Screening Enrichment
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2017-12-20, DOI: 10.1021/acs.jcim.7b00309
Hossam M. Ashtawy, Nihar R. Mahapatra

Molecular docking, scoring, and virtual screening play an increasingly important role in computer-aided drug discovery. Scoring functions (SFs) are typically employed to predict the binding conformation (docking task), binding affinity (scoring task), and binary activity level (screening task) of ligands against a critical protein target in a disease’s pathway. In most molecular docking software packages available today, a generic binding affinity-based (BA-based) SF is invoked for all three tasks to solve three different, but related, prediction problems. The limited predictive accuracies of such SFs in these three tasks have been a major roadblock toward cost-effective drug discovery. Therefore, in this work, we develop BT-Score, an ensemble machine-learning (ML) SF of boosted decision trees and thousands of predictive descriptors to estimate BA. BT-Score reproduced BA of out-of-sample test complexes with correlation of 0.825. Even with this high accuracy in the scoring task, we demonstrate that the docking and screening performance of BT-Score and other BA-based SFs is far from ideal. This has motivated us to build two task-specific ML SFs for the docking and screening problems. We propose BT-Dock, a boosted-tree ensemble model trained on a large number of native and computer-generated ligand conformations and optimized to predict binding poses explicitly. This model has shown an average improvement of 25% over its BA-based counterparts in different ligand pose prediction scenarios. Similar improvement has also been obtained by our screening-based SF, BT-Screen, which directly models the ligand activity labeling task as a classification problem. BT-Screen is trained on thousands of active and inactive protein–ligand complexes to optimize it for finding real actives from databases of ligands not seen in its training set.
In addition to the three task-specific SFs, we propose a novel multi-task deep neural network (MT-Net) that is trained on data from the three tasks to simultaneously predict binding poses, affinities, and activity levels. We show that the performance of MT-Net is superior to conventional SFs and on a par with or better than models based on single-task neural networks.
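The three tasks named in the abstract are usually judged with different metrics, which is one way to see why a single BA-based SF may not serve all of them equally well. The sketch below is illustrative only: the function names, the 2.0 Å RMSD cutoff, the top-fraction enrichment definition, and the toy numbers are assumptions made for this example, not the evaluation protocol or data of the paper.

```python
import math


def pearson(predicted, measured):
    """Scoring task: Pearson correlation between predicted and measured affinities."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in measured))
    return cov / (std_p * std_m)


def docking_success(complexes, cutoff=2.0):
    """Docking task: fraction of complexes whose top-scored pose lies within
    `cutoff` angstroms RMSD of the native pose (cutoff is an assumed convention).

    `complexes` is a list with one entry per complex; each entry is a list of
    (score, rmsd) tuples, where a higher score means a better-ranked pose.
    """
    hits = 0
    for poses in complexes:
        top_pose = max(poses, key=lambda pose: pose[0])
        if top_pose[1] <= cutoff:
            hits += 1
    return hits / len(complexes)


def enrichment_factor(scores, labels, fraction=0.1):
    """Screening task: rate of actives in the top-ranked fraction of the library
    divided by the rate of actives in the whole library (labels: 1 = active)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * fraction)))
    actives_in_top = sum(label for _, label in ranked[:n_top])
    return (actives_in_top / n_top) / (sum(labels) / len(labels))


if __name__ == "__main__":
    # Toy values invented for illustration; they are not results from the paper.
    print("correlation:", pearson([6.1, 7.4, 5.0, 8.2], [6.0, 7.0, 5.5, 8.5]))
    print("docking success:", docking_success(
        [[(0.9, 1.0), (0.5, 5.0)], [(0.8, 4.0), (0.7, 1.5)]]))
    print("enrichment:", enrichment_factor(
        [10, 9, 8, 7, 6, 5, 4, 3, 2, 1], [1, 1, 0, 0, 0, 0, 0, 0, 0, 0], 0.2))
```

A model can do well on the first metric and poorly on the other two, because the second depends only on how poses of the same ligand are ranked and the third only on how actives rank against inactives.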

Updated: 2017-12-20