QSAR-derived affinity fingerprints (part 1): fingerprint construction and modeling performance for similarity searching, bioactivity classification and scaffold hopping
Journal of Cheminformatics ( IF 7.1 ) Pub Date : 2020-05-29 , DOI: 10.1186/s13321-020-00443-6
C. Škuta , I. Cortés-Ciriano , W. Dehaen , P. Kříž , G. J. P. van Westen , I. V. Tetko , A. Bender , D. Svozil

An affinity fingerprint is a vector of a compound's affinities or potencies against a reference panel of protein targets. Here, we present the QAFFP fingerprint, a 440-element in silico QSAR-based affinity fingerprint whose components are predicted by Random Forest regression models trained on bioactivity data from the ChEMBL database. Both real-valued (rv-QAFFP) and binary (b-QAFFP) versions of the QAFFP fingerprint were implemented, and their performance in similarity searching, biological activity classification and scaffold hopping was assessed and compared with that of the 1024-bit Morgan2 fingerprint (the RDKit implementation of the ECFP4 fingerprint). In both similarity searching and biological activity classification, the QAFFP fingerprint yields retrieval rates, measured by AUC (~0.65 and ~0.70 for similarity searching, depending on the data set, and ~0.85 for classification) and EF5 (~4.67 and ~5.82 for similarity searching, depending on the data set, and ~2.10 for classification), comparable to those of the Morgan2 fingerprint (similarity searching AUC of ~0.57 and ~0.66 and EF5 of ~4.09 and ~6.41, depending on the data set; classification AUC of ~0.87 and EF5 of ~2.16). However, the QAFFP fingerprint outperforms the Morgan2 fingerprint in scaffold hopping: it retrieves 1146 of the 1749 existing scaffolds, whereas the Morgan2 fingerprint retrieves only 864.
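To make the reported metrics concrete, the following is a minimal sketch of how a similarity search over binary fingerprints could be scored. It is not the authors' code. It assumes that similarity is measured with the Tanimoto coefficient, that EF5 denotes the enrichment factor in the top 5% of the ranked list, and that AUC is the area under the ROC curve; the abstract does not spell out any of these. The function names and the toy fingerprints are hypothetical.

```python
from typing import Sequence


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient of two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    return both / either if either else 0.0


def enrichment_factor(scores: Sequence[float], labels: Sequence[int],
                      fraction: float = 0.05) -> float:
    """Hit rate in the top `fraction` of the ranking divided by the overall hit rate.

    Assumption: this is the usual definition of EF; fraction=0.05 gives "EF5".
    """
    n = len(scores)
    if n == 0 or n != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")
    n_top = max(1, int(round(n * fraction)))          # always inspect at least one compound
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (n_actives / n)


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that an active outranks an inactive (ties count 0.5)."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    inactives = [s for s, l in zip(scores, labels) if l == 0]
    if not actives or not inactives:
        raise ValueError("both actives and inactives are required")
    wins = 0.0
    for sa in actives:
        for si in inactives:
            if sa > si:
                wins += 1.0
            elif sa == si:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


if __name__ == "__main__":
    # Toy 8-bit fingerprints (hypothetical); real b-QAFFP vectors have 440 elements.
    query = [1, 1, 0, 1, 0, 0, 1, 0]
    library = [
        [1, 1, 0, 1, 0, 0, 1, 0],
        [1, 1, 0, 0, 0, 0, 1, 0],
        [0, 0, 1, 0, 1, 1, 0, 1],
        [1, 0, 0, 0, 0, 1, 0, 0],
    ]
    labels = [1, 1, 0, 0]  # 1 = active, 0 = inactive (made-up labels)
    scores = [tanimoto(query, fp) for fp in library]
    print("similarities:", scores)
    print("AUC:", roc_auc(scores, labels))
    print("EF5:", enrichment_factor(scores, labels))
```

Because EF5 is a ratio of hit rates, its maximum depends on the proportion of actives in the data set, so EF5 values obtained on differently composed data sets are not directly comparable.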

Updated: 2020-05-29