Decrypting orphan GPCR drug discovery via multitask learning
Journal of Cheminformatics (IF 7.1), Pub Date: 2024-01-23, DOI: 10.1186/s13321-024-00806-3
Wei-Cheng Huang, Wei-Ting Lin, Ming-Shiu Hung, Jinq-Chyi Lee, Chun-Wei Tung

Drug discovery for the G protein-coupled receptor (GPCR) superfamily using computational models is often limited by the availability of three-dimensional (3D) protein structures and of chemicals with experimentally measured bioactivities. Orphan GPCRs, which have no known ligands, further complicate the process. To enable drug discovery for human orphan GPCRs, multitask models were proposed for predicting the half-maximal effective concentration (EC50) of chemical-GPCR pairs. Protein information was encoded with multiple sequence alignment features, and chemical information with physicochemical properties and fingerprints. The protein features enabled transfer from data-rich GPCRs to orphan receptors, with transferability based on the similarity of protein features. The final model was trained on both agonist and antagonist data from 200 GPCRs and showed an excellent mean squared error (MSE) of 0.24 on the validation dataset. An independent test on an orphan dataset consisting of 16 receptors associated with fewer than 8 bioactivities showed a reasonably good MSE of 1.51, which improved to 0.53 when transferability based on protein features was considered. The informative features were identified and mapped to the corresponding 3D structures to gain insight into the mechanism of GPCR-ligand interactions across the GPCR family. The proposed method provides a novel perspective on learning ligand bioactivity within the diverse human GPCR superfamily and can potentially accelerate the discovery of therapeutic agents for orphan GPCRs.
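
The abstract names three ingredients that can be illustrated in a few lines: pairing chemical and protein features, scoring predictions by MSE, and relating an orphan receptor to data-rich receptors by protein feature similarity. The sketch below is illustrative only and is not the authors' implementation. All function names are hypothetical, and the use of cosine similarity is an assumption, since the abstract does not state which similarity measure was used.

```python
import math

def encode_pair(chem_features, protein_features):
    """Join chemical and protein feature vectors into one input vector.
    (Assumption: simple concatenation; the abstract does not specify this.)"""
    return list(chem_features) + list(protein_features)

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted activity values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def max_similarity_to_training(orphan_features, training_features):
    """Highest similarity between an orphan receptor and any training receptor,
    used here as a hypothetical stand-in for a transferability score."""
    return max(cosine_similarity(orphan_features, t) for t in training_features)
```

Under these assumptions, an orphan receptor whose `max_similarity_to_training` score is low would be one for which predictions transferred from data-rich GPCRs deserve less trust.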

Updated: 2024-01-23