How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain†
Environmental Science: Nano (IF 5.8), Pub Date: 2017-12-08, DOI: 10.1039/c7en00774d
A. Gajewicz 1, 2, 3, 4, 5

The EU REACH legislation, the OECD and US EPA official guidance documents, as well as the 3Rs principle (replacement, reduction, refinement of animal testing), all advocate the necessity of developing comprehensive computational methods (e.g. quantitative structure–activity relationship, read-across) that would enable the predictive modeling of both chemical (e.g. nanoparticle) specific functionalities and their hazards. However, since computational (nano)toxicology continues to ‘learn on the fly’ and relies on the use of a vast array of innovative machine-learning algorithms, serious concerns about the reliability of in silico predictions are raised. This study aimed to give an answer to the following question: how to judge whether QSAR/read-across predictions are reliable. Here, an effective approach for graphical assessment of the limits of a model's reliable predictions (so-called applicability domain, AD) was introduced. The probability-oriented distance-based approach (ADProbDist) was proposed as a robust and automatic method for defining the interpolation space where true and reliable predictions can be expected. Its usefulness was confirmed by using four nano-QSAR/read-across models recently reported in the literature. The results of the study showed that the ADProbDist approach is more restrictive in terms of the chemical space that falls in the AD of a model than the range, geometrical, distance and leverage approaches. The advantages of the proposed ADProbDist approach include (but are not limited to) the fact that it works with relatively small datasets and enables the identification of (un)reliable predictions for newly screened chemicals without experimental data. Further, to facilitate the use of the ADProbDist approach, this study provides the developed in-house R-codes.
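To make the idea of an applicability domain concrete, the sketch below shows a generic centroid-distance check, one of the conventional "distance" approaches the abstract compares against. It is not the ADProbDist method, whose details are not given here, and it is not the author's R code. The function names `fit_domain` and `in_domain`, the `coverage` parameter, and the descriptor values are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def fit_domain(train, coverage=0.95):
    """Fit a simple centroid-distance applicability domain (illustrative only).

    train: list of descriptor vectors for the training compounds.
    coverage: fraction of training compounds that must fall inside the domain.
    Returns a distance function and the distance threshold.
    """
    columns = list(zip(*train))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; use 1.0 for constant descriptors
    # so that the scaling never divides by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]

    def distance(x):
        # Euclidean distance to the training centroid in standardized units.
        return math.sqrt(
            sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds))
        )

    train_distances = sorted(distance(row) for row in train)
    # Empirical quantile of the training distances, clamped to a valid index.
    k = math.ceil(coverage * len(train_distances)) - 1
    k = max(0, min(len(train_distances) - 1, k))
    return distance, train_distances[k]


def in_domain(x, distance, threshold):
    """Return True when the query compound lies within the fitted domain."""
    return distance(x) <= threshold


# Made-up descriptor values for five hypothetical training compounds.
training_set = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.5, 10.5], [1.5, 11.5]]
dist_fn, limit = fit_domain(training_set)
query_is_reliable = in_domain([2.2, 11.1], dist_fn, limit)
```

A prediction for a query compound would be flagged as an extrapolation when `in_domain` returns False. With very small training sets the empirical quantile collapses to the largest training distance, which is one reason a threshold of this kind can be less restrictive than a probability-oriented one.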

Updated: 2017-12-08