Large scale comparison of QSAR and conformal prediction methods and their applications in drug discovery.
Journal of Cheminformatics ( IF 8.6 ) Pub Date : 2019-01-10 , DOI: 10.1186/s13321-018-0325-4
Nicolas Bosc 1 , Francis Atkinson 1 , Eloy Felix 1 , Anna Gaulton 1 , Anne Hersey 1 , Andrew R Leach 1

Structure–activity relationship modelling is frequently used in the early stage of drug discovery to assess the activity of a compound on one or several targets, and can also be used to assess the interaction of compounds with liability targets. QSAR models have been used for these and related applications over many years, with good success. Conformal prediction is a relatively new QSAR approach that provides information on the certainty of a prediction, and so helps in decision-making. However, it is not always clear how best to make use of this additional information. In this article, we describe a case study that directly compares conformal prediction with traditional QSAR methods for large-scale predictions of target-ligand binding. The ChEMBL database was used to extract a data set comprising data from 550 human protein targets with different bioactivity profiles. For each target, a QSAR model and a conformal predictor were trained and their results compared. The models were then evaluated on new data published since the original models were built to simulate a “real world” application. The comparative study highlights the similarities between the two techniques but also some differences that it is important to bear in mind when the methods are used in practical drug discovery applications.
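
To make concrete how a conformal predictor can attach certainty information to an active/inactive call, here is a minimal sketch of one common variant: an inductive, class-conditional (Mondrian) conformal classifier. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the underlying model, the nonconformity measure, or the significance levels. The function names, the nonconformity score (1 minus the predicted probability of the class), and the toy calibration values are all hypothetical.

```python
from typing import Dict, List, Sequence, Set

# Assumption: class 0 = inactive, class 1 = active, and some underlying
# model (not shown) returns [p_inactive, p_active] for each compound.


def calibrate(cal_probs: Sequence[Sequence[float]],
              cal_labels: Sequence[int]) -> Dict[int, List[float]]:
    """Collect nonconformity scores per true class from a calibration set.

    The score used here is 1 - p(true class); this is one common choice,
    assumed for illustration.
    """
    scores: Dict[int, List[float]] = {}
    for probs, label in zip(cal_probs, cal_labels):
        scores.setdefault(label, []).append(1.0 - probs[label])
    return scores


def p_values(probs: Sequence[float],
             cal_scores: Dict[int, List[float]]) -> Dict[int, float]:
    """Compute one p-value per candidate class for a new compound."""
    pvals: Dict[int, float] = {}
    for label, scores in cal_scores.items():
        alpha = 1.0 - probs[label]
        # Fraction of calibration compounds of this class that look at
        # least as "strange" as the new compound would under this label.
        n_at_least = sum(1 for s in scores if s >= alpha)
        pvals[label] = (n_at_least + 1) / (len(scores) + 1)
    return pvals


def prediction_set(probs: Sequence[float],
                   cal_scores: Dict[int, List[float]],
                   significance: float) -> Set[int]:
    """Return every class whose p-value exceeds the significance level."""
    return {label for label, p in p_values(probs, cal_scores).items()
            if p > significance}


# Toy calibration data (made up for this sketch).
cal_probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3],
             [0.2, 0.8], [0.3, 0.7], [0.1, 0.9], [0.45, 0.55]]
cal_labels = [0, 0, 0, 0, 1, 1, 1, 1]
cal_scores = calibrate(cal_probs, cal_labels)

confident = prediction_set([0.95, 0.05], cal_scores, significance=0.25)
ambiguous = prediction_set([0.5, 0.5], cal_scores, significance=0.1)
outlier = prediction_set([0.5, 0.5], cal_scores, significance=0.25)
```

In this sketch the output is a set of labels rather than a single label. A single-label set is a confident call, a set containing both labels means the predictor cannot separate the classes at that significance level, and an empty set flags a compound unlike the calibration data. A traditional QSAR classifier would return one label in all three cases, which is the practical difference a comparison of the two approaches has to account for.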

Updated: 2019-01-10