Overview of the SAMPL6 pKa challenge: evaluating small molecule microscopic and macroscopic pKa predictions
Journal of Computer-Aided Molecular Design (IF 3.0), Pub Date: 2021-01-04, DOI: 10.1007/s10822-020-00362-6
Mehtap Işık 1, 2 , Ariën S Rustenburg 1, 3 , Andrea Rizzi 1, 4 , M R Gunner 5 , David L Mobley 6 , John D Chodera 1

The prediction of acid dissociation constants (pKa) is a prerequisite for predicting many other properties of a small molecule, such as its protein–ligand binding affinity, distribution coefficient (log D), membrane permeability, and solubility. The prediction of each of these properties requires knowledge of the relevant protonation states and solution free energy penalties of each state. The SAMPL6 pKa Challenge was the first separate challenge conducted to evaluate pKa predictions as part of the Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) exercises. This challenge was motivated by significant inaccuracies observed in prior physical property prediction challenges, such as the SAMPL5 log D Challenge, caused by protonation state and pKa prediction issues. The goal of the pKa challenge was to assess the performance of contemporary pKa prediction methods for drug-like molecules. The challenge set was composed of 24 small molecules that resembled fragments of kinase inhibitors, a number of which were multiprotic. Eleven research groups contributed blind predictions for a total of 37 distinct pKa prediction methods. In addition to blinded submissions, four widely used pKa prediction methods were included in the analysis as reference methods. Collecting both microscopic and macroscopic pKa predictions allowed in-depth evaluation of pKa prediction performance. This article highlights deficiencies of typical pKa prediction evaluation approaches when the distinction between microscopic and macroscopic pKas is ignored; in particular, we suggest more stringent evaluation criteria for microscopic and macroscopic pKa predictions guided by the available experimental data.
Top-performing submissions for macroscopic pKa predictions achieved RMSE of 0.7–1.0 pKa units and included both quantum chemical and empirical approaches, where the total number of extra or missing macroscopic pKas predicted by these submissions was fewer than 8 for 24 molecules. A large number of submissions had RMSE spanning 1–3 pKa units. Molecules with sulfur-containing heterocycles or iodo and bromo groups were less accurately predicted on average considering all methods evaluated. For a subset of molecules, we utilized experimentally determined microstates based on NMR to evaluate the dominant tautomer predictions for each macroscopic state. Prediction of dominant tautomers was a major source of error for microscopic pKa predictions, especially errors in charged tautomers. The degree of inaccuracy in pKa predictions observed in this challenge is detrimental to protein–ligand binding affinity predictions due to errors in dominant protonation state predictions and the calculation of free energy corrections for multiple protonation states. Underestimation of ligand pKa by 1 unit can lead to errors in binding free energy of up to 1.2 kcal/mol. The SAMPL6 pKa Challenge demonstrated the need for improving pKa prediction methods for drug-like molecules, especially for challenging moieties and multiprotic molecules.
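
The distinction between microscopic and macroscopic pKas that the abstract emphasizes can be illustrated with a minimal Python sketch. It assumes the simplest textbook case: a single protonated microstate that can lose a proton to form several deprotonated tautomers, so the macroscopic dissociation constant is the sum of the microscopic ones. The function names `macro_pka_from_micro` and `rmse`, and all numeric values, are hypothetical illustrations; they are not taken from the SAMPL6 data or from the challenge's analysis code.

```python
import math


def macro_pka_from_micro(micro_pkas):
    """Macroscopic pKa for one protonated microstate that deprotonates
    into several tautomers (assumption: parallel microscopic equilibria).

    K_macro = sum_i K_micro_i, so pKa_macro = -log10(sum_i 10**(-pKa_i)).
    """
    if not micro_pkas:
        raise ValueError("need at least one microscopic pKa")
    k_macro = sum(10.0 ** (-pka) for pka in micro_pkas)
    return -math.log10(k_macro)


def rmse(predicted, experimental):
    """Root-mean-square error between already-matched pKa pairs."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative values only: two equally likely deprotonation sites.
    macro = macro_pka_from_micro([7.0, 7.0])
    print(f"macroscopic pKa: {macro:.2f}")

    # Illustrative matched predicted vs. experimental macroscopic pKas.
    print(f"RMSE: {rmse([4.1, 8.9], [4.5, 8.0]):.2f}")
```

Note that the macroscopic pKa is lower than either microscopic value when two sites compete, which is one reason comparing a predicted microscopic pKa directly against an experimental macroscopic pKa can mislead. The `rmse` helper assumes the predicted and experimental values have already been paired; how to pair them for multiprotic molecules is a separate design decision that this sketch does not address.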




Updated: 2021-01-04