Charged aerosol detector response modeling for fatty acids based on experimental settings and molecular features: a machine learning approach
Journal of Cheminformatics (IF 8.6), Pub Date: 2021-07-15, DOI: 10.1186/s13321-021-00532-0
Ruben Pawellek 1, Jovana Krmar 2, Adrian Leistner 1, Nevena Djajić 2, Biljana Otašević 2, Ana Protić 2, Ulrike Holzgrabe 1

The charged aerosol detector (CAD) is the latest representative of aerosol-based detectors that generate a response independent of the analytes’ chemical structure. This study aimed to accurately predict the CAD response of homologous fatty acids under varying experimental conditions. Fatty acids from C12 to C18 were used as model substances because their semivolatile characteristics cause non-uniform CAD behaviour. Considering both experimental conditions and molecular descriptors, mixed quantitative structure–property relationship (QSPR) modeling was performed using Gradient Boosted Trees (GBT). An ensemble of 10 decision trees (learning rate 0.55, maximal depth 5, sample rate 1.0) explained approximately 99% (Q²: 0.987, RMSE: 0.051) of the observed variance in CAD responses. Validation with an external test compound confirmed the high predictive ability of the established model (R²: 0.990, RMSEP: 0.050). With respect to its intrinsic attribute selection strategy, GBT used almost all independent variables during model building. It attributed the highest importance to the power function value, the flow rate of the mobile phase, the evaporation temperature, the content of organic solvent in the mobile phase, and molecular descriptors such as molecular weight (MW), Radial Distribution Function - 080 / weighted by mass (RDF080m), and the average coefficient of the last eigenvector from the distance/detour matrix (Ve2_D/Dt). Identifying the factors most relevant to CAD responsiveness contributed to a better understanding of the underlying mechanisms of signal generation. The increased CAD response obtained with acetone as organic modifier demonstrated its potential to replace the more expensive and environmentally harmful acetonitrile.
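
The abstract reports model quality as Q²/R² and RMSE/RMSEP without defining them. As a rough sketch, assuming the standard definitions (RMSE as the square root of the mean squared residual, and R² as one minus the ratio of the residual sum of squares to the total sum of squares, reported as Q² when the predictions come from cross-validation), the two metrics can be computed as follows. The function names and the numeric values are hypothetical and purely illustrative; they are not data from the study, and the authors' exact formulas may differ.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    squared_residuals = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_residuals) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: SS_tot is taken around the mean of the observed values
    passed in. When `predicted` holds cross-validated predictions, this
    quantity is commonly reported as Q2.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only; not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.9]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

Under these assumptions, RMSEP is the same RMSE computation applied to the external test compound rather than to the training or cross-validation predictions.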

Updated: 2021-07-15