Adapting Deep Learning QSPR Models to Specific Drug Discovery Projects
Molecular Pharmaceutics (IF 4.9), Pub Date: 2024-02-19, DOI: 10.1021/acs.molpharmaceut.3c01124
Andrin Fluetsch, Elena Di Lascio, Grégori Gerebtzoff, Raquel Rodríguez-Pérez

Medicinal chemistry and drug design efforts can be assisted by machine learning (ML) models that relate the molecular structure to compound properties. Such quantitative structure–property relationship models are generally trained on large data sets that include diverse chemical series (global models). In the pharmaceutical industry, these ML global models are available across discovery projects as an “out-of-the-box” solution to assist in drug design, synthesis prioritization, and experiment selection. However, drug discovery projects typically focus on confined parts of the chemical space (e.g., chemical series), where global models might not be applicable. Local ML models are sometimes generated to focus on specific projects or series. Herein, ML-based global models, local models, and hybrid global-local strategies were benchmarked. Analyses were done for more than 300 drug discovery projects at Novartis and ten absorption, distribution, metabolism, and excretion (ADME) assays. In this work, hybrid global-local strategies based on transfer learning approaches were proposed to leverage both historical ADME data (global) and project-specific data (local) to adapt model predictions. Fine-tuning a pretrained global ML model (used for weights’ initialization, WI) was the top-performing method. Average improvements of mean absolute errors across all assays were 16% and 27% compared with global and local models, respectively. Interestingly, when the effect of training set size was analyzed, WI fine-tuning was found to be successful even in low-data scenarios (e.g., ∼10 molecules per project). Taken together, this work highlights the potential of domain adaptation in the field of molecular property predictions to refine existing pretrained models on a new compound data distribution.
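The weights' initialization (WI) idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it swaps the deep learning QSPR model for a one-descriptor linear model, uses made-up data, and all names (`train`, `mae`, `global_params`, and so on) are hypothetical. It only shows the mechanics of starting project-specific training from globally pretrained weights rather than from scratch, under an identical small training budget. The ordering of errors here follows from how the toy is constructed and says nothing about the magnitudes reported in the paper.

```python
# Toy illustration of weights' initialization (WI) fine-tuning.
# Assumption: a linear model y = w*x + b stands in for the deep QSPR model.

def mae(params, xs, ys):
    """Mean absolute error of the linear model on (xs, ys)."""
    w, b = params
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys)) / len(xs)

def train(params, xs, ys, lr=0.1, steps=10):
    """Plain gradient descent on mean squared error, starting from `params`."""
    w, b = params
    n = len(xs)
    for _ in range(steps):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2 * sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# "Global" historical data: many compounds, made-up relationship y = 2x + 1.
global_x = [i / 10 for i in range(-20, 21)]
global_y = [2.0 * x + 1.0 for x in global_x]

# "Local" project data: 10 compounds with a shifted relationship y = 2.3x + 1.4.
local_x = [-0.9, -0.7, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.7, 0.9]
local_y = [2.3 * x + 1.4 for x in local_x]

# Held-out project compounds used only for evaluation.
test_x = [-0.8, -0.4, 0.0, 0.4, 0.8]
test_y = [2.3 * x + 1.4 for x in test_x]

# 1) Global model: pretrained on historical data only.
global_params = train((0.0, 0.0), global_x, global_y, steps=300)

# 2) Local model: trained from scratch on the small project set.
local_params = train((0.0, 0.0), local_x, local_y, steps=10)

# 3) WI fine-tuning: same budget, but initialized from the global weights.
finetuned_params = train(global_params, local_x, local_y, steps=10)

mae_global = mae(global_params, test_x, test_y)
mae_local = mae(local_params, test_x, test_y)
mae_finetuned = mae(finetuned_params, test_x, test_y)

print(f"global:     {mae_global:.3f}")
print(f"local:      {mae_local:.3f}")
print(f"fine-tuned: {mae_finetuned:.3f}")
```

In this toy setting the fine-tuned model has the lowest held-out error because it starts close to the project's relationship and needs only a small correction, which is the intuition behind adapting a pretrained global model to a new compound distribution.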

Updated: 2024-02-19