The effect of noise on the predictive limit of QSAR models
Journal of Cheminformatics (IF 7.1), Pub Date: 2021-11-25, DOI: 10.1186/s13321-021-00571-7
Scott S Kolmar, Christopher M Grulke

A key challenge in the field of Quantitative Structure Activity Relationships (QSAR) is how to treat experimental error effectively in the training and evaluation of computational models. It is often assumed in QSAR that models cannot produce predictions that are more accurate than their training data. Additionally, it is implicitly assumed, by necessity, that data points in test sets or validation sets contain no error, and that each data point is a population mean. This work proposes the hypothesis that QSAR models can make predictions that are more accurate than their training data, and that the error-free test set assumption leads to a significant misevaluation of model performance. This work used eight datasets covering six common QSAR endpoints, because different endpoints should carry different amounts of experimental error, depending on the complexity of the measurements. Up to 15 levels of simulated Gaussian-distributed random error were added to the datasets, and models were built on the error-laden datasets using five different algorithms. The models were trained on the error-laden data, then evaluated both on error-laden test sets and on error-free test sets. The results show that, for each level of added error, the RMSE for evaluation on the error-free test sets was always better (lower). The results support the hypothesis that, at least under the conditions of Gaussian-distributed random error, QSAR models can make predictions that are more accurate than their training data, and that evaluating models on error-laden test and validation sets may give a flawed measure of model performance. These results have implications for how QSAR models are evaluated, especially in disciplines where experimental error is very large, such as computational toxicology.
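
The experimental design described in the abstract can be illustrated with a minimal sketch on synthetic data. It makes several assumptions not taken from the paper: the "true" endpoint is a noise-free linear function of random descriptors, the model is ordinary least squares fitted with NumPy, and the function names (`rmse`, `noise_experiment`) and the noise levels are hypothetical. The paper's eight datasets and five algorithms are not reproduced here.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-square error between two 1-D arrays."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def noise_experiment(noise_sds, n_train=400, n_test=200, n_features=5, seed=0):
    """Train on noisy labels; score against noisy and noise-free test labels.

    Returns a list of (noise_sd, rmse_vs_noisy_test, rmse_vs_clean_test).
    """
    rng = np.random.default_rng(seed)

    # Assumption: the error-free endpoint is linear in synthetic descriptors.
    weights = rng.normal(size=n_features)
    X_train = rng.normal(size=(n_train, n_features))
    X_test = rng.normal(size=(n_test, n_features))
    y_train_true = X_train @ weights
    y_test_true = X_test @ weights

    results = []
    for sd in noise_sds:
        # Add simulated Gaussian random error to training and test labels.
        y_train_noisy = y_train_true + rng.normal(scale=sd, size=n_train)
        y_test_noisy = y_test_true + rng.normal(scale=sd, size=n_test)

        # Fit ordinary least squares on the error-laden training labels.
        coef, *_ = np.linalg.lstsq(X_train, y_train_noisy, rcond=None)
        y_pred = X_test @ coef

        results.append(
            (sd, rmse(y_test_noisy, y_pred), rmse(y_test_true, y_pred))
        )
    return results


if __name__ == "__main__":
    for sd, rmse_noisy, rmse_clean in noise_experiment([0.5, 1.0, 2.0]):
        print(f"noise sd={sd}: RMSE vs noisy test={rmse_noisy:.3f}, "
              f"RMSE vs clean test={rmse_clean:.3f}")
```

In this toy setting, the RMSE measured against error-free test labels would be expected to fall below both the RMSE measured against error-laden test labels and the standard deviation of the added noise. This mirrors the abstract's claim qualitatively and says nothing about the magnitudes reported in the paper.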

Updated: 2021-11-25