Perturbation-Theory and Machine Learning (PTML) Model for High-Throughput Screening of Parham Reactions: Experimental and Theoretical Studies
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2018-06-13, DOI: 10.1021/acs.jcim.8b00286
Lorena Simón-Vidal 1, Oihane García-Calvo 1, Uxue Oteo 1, Sonia Arrasate 1, Esther Lete 1, Nuria Sotomayor 1, Humberto González-Díaz 1, 2

Machine learning (ML) algorithms are gaining importance in the processing of chemical information and the modeling of chemical reactivity problems. In this work, we have developed a perturbation-theory and machine learning (PTML) model that combines perturbation theory (PT) and ML algorithms to predict the yield of a given reaction. For this purpose, we have selected the Parham cyclization, a general and powerful tool for the synthesis of heterocyclic and carbocyclic compounds. This reaction has both structural variables (substitution pattern on the substrate, internal electrophile, ring size, etc.) and operational variables (organolithium reagent, solvent, temperature, time, etc.), so predicting how changes in substrate design (internal electrophile, halide, etc.) or in reaction conditions affect the yield is an important task that could help to optimize the reaction design. The PTML model uses PT operators to account for perturbations in the experimental conditions and/or the structural variables of all the molecules involved in a query reaction, relative to a reference reaction. To this end, a dataset of >100 reactions has been collected for different substrates and internal electrophiles, under different reaction conditions, with a wide range of yields (0–98%). The best PTML model found using General Linear Regression (GLR) has R = 0.88 in the training series and R = 0.83 in the external validation series for 10 000 pairs of query and reference reactions. The PTML model has a final R = 0.95 for all reactions when multiple reference reactions are used. We also report a comparative study of linear versus nonlinear PTML models based on artificial neural network (ANN) algorithms. The PTML-ANN models (LNN, MLP, RBF), with R ≈ 0.1–0.8, do not outperform the first PTML model. This result supports the validity of the linear form of the model. Next, we carried out an experimental and theoretical study of previously unreported Parham reactions to illustrate the practical use of the PTML model. A 500 000-point simulation and a Hammett analysis of the reactivity space of Parham reactions are also reported.
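
The query-versus-reference idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the PT operators, so the sketch assumes the simplest one (the difference between query and reference descriptor values), uses ordinary least squares as a stand-in for GLR, and runs on synthetic descriptors and yields. All function and variable names are hypothetical.

```python
import numpy as np

def perturbation_features(query_desc, ref_desc):
    """Assumed PT operator: descriptor difference, query minus reference."""
    return np.asarray(query_desc, dtype=float) - np.asarray(ref_desc, dtype=float)

def build_pairs(descriptors, yields, idx, n_pairs, rng):
    """Sample (query, reference) pairs from the reactions listed in idx.

    Each row of X holds: intercept, yield of the reference reaction,
    and the perturbation terms. The target y is the yield of the query.
    """
    q = rng.choice(idx, size=n_pairs)
    r = rng.choice(idx, size=n_pairs)
    delta = perturbation_features(descriptors[q], descriptors[r])
    X = np.column_stack([np.ones(n_pairs), yields[r], delta])
    return X, yields[q]

def fit_linear(X, y):
    """Ordinary least squares (a stand-in for the GLR step)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Synthetic data only: 100 made-up "reactions" with 3 made-up descriptors.
# These numbers are NOT taken from the paper.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(100, 3))
yields = 50 + descriptors @ np.array([10.0, -5.0, 3.0]) + rng.normal(scale=2.0, size=100)

# Hold out 20 reactions so that validation pairs use unseen reactions.
train_idx = np.arange(0, 80)
valid_idx = np.arange(80, 100)

X_train, y_train = build_pairs(descriptors, yields, train_idx, 2000, rng)
X_valid, y_valid = build_pairs(descriptors, yields, valid_idx, 500, rng)

coef = fit_linear(X_train, y_train)
r_train = pearson_r(y_train, X_train @ coef)
r_valid = pearson_r(y_valid, X_valid @ coef)
print(f"R (training) = {r_train:.2f}, R (validation) = {r_valid:.2f}")
```

Because the synthetic yields are linear in the descriptors by construction, the linear fit here is expected to do well; the sketch only shows the mechanics of pairing reactions and regressing on perturbation terms, and says nothing about the reported R values.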

Updated: 2018-06-13