Predicting gas phase entropy of select hydrocarbon classes through specific information-theoretical molecular descriptors.
SAR and QSAR in Environmental Research (IF 3), Pub Date: 2019-06-20, DOI: 10.1080/1062936x.2019.1624613
C Raychaudhury, I H Rizvi, D Pal

The usefulness of five specific information-theoretical molecular descriptors was investigated for predicting the gas phase entropy of selected classes of acyclic and cyclic compounds. Among them, total information on atomic number (TIZ), graph vertex complexity (HV) and total information on bonds (TIBAT), considered together, showed the best correlation with the gas phase entropy values of 130 compounds, along with a low standard deviation (r² = 0.97, s = 21.14). The multiple regression equation treating these three indices as independent variables was statistically highly significant, as was evident from the F-statistic. In particular, the very small difference between the r² and r²-pred values indicates that the regression model is not overfitted and is therefore suitable for prediction purposes. When these 130 compounds were used as a training set to predict, from the regression equation, the entropy of 40 additional compounds, a very high correlation was obtained (r² = 0.975), which remained almost identical (r² = 0.97) for the combined data set of 170 compounds. The three indices thus appear to be useful descriptors, producing a correlation that remains stable as the size of the data set changes. The information-theoretical measures also appear to capture the additive-cum-constitutive nature of gas phase entropy, yielding an acceptable statistical fit.
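The validation workflow described in the abstract (fit a three-descriptor multiple regression on 130 compounds, compare r² with r²-pred, then predict 40 held-out compounds) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the data are synthetic random numbers, not the paper's descriptor or entropy values; the function names (`fit_ols`, `r_squared`, `r_squared_pred`) are hypothetical; and r²-pred is computed here with the common leave-one-out (PRESS-based) definition, which the abstract does not confirm as the authors' choice.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] the slopes

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def r_squared_pred(X, y):
    """Leave-one-out predictive r^2 (assumed definition): 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        mask = np.arange(n) != i          # drop compound i, refit, predict it
        coef = fit_ols(X[mask], y[mask])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-ins for three descriptors and an entropy-like response.
# These numbers are made up for illustration; they are NOT the paper's data.
rng = np.random.default_rng(0)
true_coef = np.array([50.0, 4.0, 2.5, 1.5])
X_train = rng.uniform(0, 10, size=(130, 3))   # 130 "training" compounds
y_train = predict(true_coef, X_train) + rng.normal(0, 1.0, size=130)
X_test = rng.uniform(0, 10, size=(40, 3))     # 40 "additional" compounds
y_test = predict(true_coef, X_test) + rng.normal(0, 1.0, size=40)

coef = fit_ols(X_train, y_train)
r2_fit = r_squared(y_train, predict(coef, X_train))
r2_loo = r_squared_pred(X_train, y_train)
r2_test = r_squared(y_test, predict(coef, X_test))

print(f"r2 (fit) = {r2_fit:.3f}, r2-pred (LOO) = {r2_loo:.3f}, "
      f"r2 (held-out) = {r2_test:.3f}")
```

A small gap between `r2_fit` and `r2_loo` is the kind of evidence the abstract cites against overfitting: each compound is predicted by a model that never saw it, so a model that merely memorised the training set would show a much lower leave-one-out value.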




Updated: 2019-06-20