QSPR modelling of the soil sorption coefficient from training sets of different sizes.
SAR and QSAR in Environmental Research ( IF 2.3 ) Pub Date : 2019-04-15 , DOI: 10.1080/1062936x.2019.1586759
C J M Olguin 1 , S C Sampaio 1 , R R Dos Reis 1 , M B Remor 1 , C F A Olguin 2

Quantitative structure–property relationship (QSPR) modelling has been used in many scientific fields. This approach has been extensively applied in environmental research to predict physicochemical properties of compounds with potential environmental impact. The soil sorption coefficient is an important parameter for the evaluation of environmental risks, and it helps to determine the final fate of substances in the environment. In the last few years, different QSPR models have been developed for the determination of the sorption coefficient. In this study, several QSPR models were generated and evaluated for the prediction of log Koc from the relationship with log P. These models were obtained from an extensive and diverse training set (n = 639) and from subsets of this initial set (i.e. halves, fourths and eighths). The aim of this study was to investigate whether the size of the training set affects the statistical quality of the obtained models. Furthermore, statistical equivalence was verified between the models obtained from smaller sets and the model obtained from the total training set. The results confirmed the equivalence between the models, thus indicating the possibility of using smaller training sets without compromising the statistical quality and predictive capability, as long as most chemical classes in the test set are represented in the training set.
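The comparison described in the abstract, a single-descriptor model of log Koc against log P fitted on the full training set and on progressively smaller subsets, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the slope (0.5), intercept (1.0) and noise level are arbitrary placeholders rather than values from the paper, the subsets are drawn at random, and the function names are hypothetical. The abstract does not state how the subsets were selected or which statistical test was used to establish equivalence, so the sketch only compares fitted coefficients and test-set RMSE.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)


def rmse(x, y, slope, intercept):
    """Root-mean-square error of the linear model on (x, y)."""
    residuals = y - (slope * x + intercept)
    return float(np.sqrt(np.mean(residuals ** 2)))


def compare_training_sizes(log_p, log_koc, test_x, test_y,
                           divisors=(1, 2, 4, 8), seed=0):
    """Fit the model on the full set, a half, a fourth and an eighth.

    Subsets are taken from one random permutation (an assumption; the
    source does not describe the selection procedure).
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(log_p))
    results = {}
    for k in divisors:
        idx = order[: len(log_p) // k]
        slope, intercept = fit_line(log_p[idx], log_koc[idx])
        results[k] = {
            "n": len(idx),
            "slope": slope,
            "intercept": intercept,
            "rmse_test": rmse(test_x, test_y, slope, intercept),
        }
    return results


# Synthetic stand-in data (NOT the paper's data set).
rng = np.random.default_rng(42)
log_p = rng.uniform(-1.0, 7.0, size=639)
log_koc = 0.5 * log_p + 1.0 + rng.normal(0.0, 0.4, size=639)
test_x = rng.uniform(-1.0, 7.0, size=200)
test_y = 0.5 * test_x + 1.0 + rng.normal(0.0, 0.4, size=200)

results = compare_training_sizes(log_p, log_koc, test_x, test_y)
for k, r in results.items():
    print(f"1/{k} of training set: n={r['n']}, "
          f"slope={r['slope']:.3f}, intercept={r['intercept']:.3f}, "
          f"test RMSE={r['rmse_test']:.3f}")
```

On data of this kind the fitted coefficients and the test RMSE typically change little as the training set shrinks, which is the qualitative pattern the abstract reports. A random split does not guarantee the abstract's condition that most chemical classes in the test set are represented in the training set; with real compounds, a split stratified by chemical class would be one way to respect it.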




Updated: 2019-04-15