Machine learning and ligand binding predictions: A review of data, methods, and obstacles.
Biochimica et Biophysica Acta (BBA) - General Subjects ( IF 3 ) Pub Date : 2020-02-10 , DOI: 10.1016/j.bbagen.2020.129545
Sally R Ellingson 1 , Brian Davis 2 , Jonathan Allen 3

Computational prediction of ligand binding is a difficult problem, and the more accurate methods are extremely computationally expensive. Machine learning for drug binding prediction could leverage biomedical big data in place of time-intensive simulations. This paper reviews current trends in the use of machine learning for drug binding predictions, the data sources available for developing machine learning algorithms, and potential problems that may lead to overfitting and ungeneralizable models. A few popular datasets that can be used to develop virtual high-throughput screening models are characterized using spatial statistics to quantify potential biases. Evaluating some common benchmarks shows that good performance correlates with models that have high predicted bias scores, while models with low bias scores do not have much predictive power. A better understanding of the limits of available data sources, and of how to fix them, will lead to more generalizable models and, in turn, to novel drug discovery.

Updated: 2020-02-10