Prediction of partition coefficient in high-pressure carbon dioxide–water systems using machine learning
The Journal of Supercritical Fluids ( IF 3.4 ) Pub Date : 2021-09-24 , DOI: 10.1016/j.supflu.2021.105421
Tatsuya Fujii 1 , Marina Kobune 1, 2

A method of predicting the partition coefficients (log K) of organic compounds in high-pressure carbon dioxide–water systems using machine learning was investigated. Using log K data collected from the literature, several linear and non-linear regression models were constructed. Cross-validation of these models indicated that log K can be approximately predicted with a root-mean-squared error of 0.6–1.1. Overall, the non-linear model predicted log K better, but linear models such as lasso regression exhibited comparable performance when the number of features describing the compounds was reduced. Parity plots indicated several outliers, most of which were compounds containing several polar functional groups. The analysis of feature importance revealed that the constructed model relied primarily on log P (the 1-octanol–water partition coefficient), adjusted by process parameters and features related to the charges of the compounds. The machine-learning approach yields a physicochemically reasonable model.
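To make the cross-validated root-mean-squared error mentioned in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' code: the data are synthetic, the single-feature least-squares fit of log K against log P stands in for the paper's regression models, and pooling the squared errors over all folds is an assumption about how the error is aggregated. The names `fit_line` and `kfold_rmse` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def kfold_rmse(xs, ys, k=5, seed=0):
    """Pooled RMSE over k held-out folds (assumed aggregation scheme)."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        # Score only the samples the model did not see during fitting.
        squared_error += sum((ys[i] - (slope * xs[i] + intercept)) ** 2
                             for i in held_out)
    return math.sqrt(squared_error / len(xs))


# Synthetic data only: log K is generated as a noisy linear function of log P.
# The coefficients and noise level are made up and do not come from the paper.
rng = random.Random(42)
log_p = [rng.uniform(-1.0, 4.0) for _ in range(60)]
log_k = [0.8 * x - 0.5 + rng.gauss(0.0, 0.3) for x in log_p]

rmse = kfold_rmse(log_p, log_k)
print(f"cross-validated RMSE: {rmse:.2f}")
```

With real data, each sample would also carry process parameters and molecular descriptors, and the single-feature fit would be replaced by a multi-feature model such as lasso regression or a non-linear regressor.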




Updated: 2021-10-01