Predicting reaction performance in C–N cross-coupling using machine learning
Science (IF 56.9), Pub Date: 2018-02-15, DOI: 10.1126/science.aar5169
Derek T. Ahneman, Jesús G. Estrada, Shishi Lin, Spencer D. Dreher, Abigail G. Doyle

A guide for catalyst choice in the forest

Chemists often discover reactions by applying catalysts to a series of simple compounds. Tweaking those reactions to tolerate more structural complexity in pharmaceutical research is time-consuming. Ahneman et al. report that machine learning can help. Using a high-throughput data set, they trained a random forest algorithm to predict which specific palladium catalysts would best tolerate isoxazoles (cyclic structures with an N–O bond) during C–N bond formation. The predictions also helped to guide analysis of the catalyst inhibition mechanism.

Science, this issue p. 186

A random forest algorithm trained on high-throughput data predicts which catalysts best tolerate certain heterocycles.

Machine learning methods are becoming integral to scientific inquiry in numerous disciplines. We demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation. We created scripts to compute and extract atomic, molecular, and vibrational descriptors for the components of a palladium-catalyzed Buchwald-Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of various potentially inhibitory additives. Using these descriptors as inputs and reaction yield as output, we showed that a random forest algorithm provides significantly improved predictive performance over linear regression analysis. The random forest model was also successfully applied to sparse training sets and out-of-sample prediction, suggesting its value in facilitating adoption of synthetic methodology.
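The comparison in the abstract (descriptors as inputs, yield as output, random forest versus linear regression) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic, the three "descriptors" and the threshold rule that generates the "yield" are invented, and the tiny hand-written forest (function names such as `fit_forest` and `compare_models` are hypothetical) is not the authors' implementation or their data set. It only shows why a tree ensemble can outperform a linear fit when yield depends on descriptors in a nonlinear, interacting way.

```python
import random

def mean(values):
    return sum(values) / len(values)

def make_data(n, rng):
    """Synthetic stand-in data (NOT from the paper): 3 made-up descriptors -> 'yield'."""
    X, y = [], []
    for _ in range(n):
        x = [rng.random() for _ in range(3)]
        # Hypothetical nonlinear rule: high yield only if x0 and x1 both pass a
        # threshold; x2 is a pure-noise descriptor.
        base = 80.0 if (x[0] > 0.4 and x[1] > 0.3) else 10.0
        X.append(x)
        y.append(base + rng.gauss(0, 3))
    return X, y

def fit_linear(X, y):
    """Ordinary least squares: solve the normal equations by Gauss-Jordan elimination."""
    A = [[1.0] + row for row in X]  # prepend an intercept column
    k = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(a[i] * a[j] for a in A) for j in range(k)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    w = [M[i][k] / M[i][i] for i in range(k)]
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def fit_tree(X, y, rng, depth=0, max_depth=6, min_leaf=5):
    """Regression tree; every split looks at a random subset of the features."""
    leaf_value = mean(y)
    n, n_feat = len(y), len(X[0])
    best = None  # (sum of squared errors, feature index, threshold)
    if depth < max_depth and n >= 2 * min_leaf:
        for j in rng.sample(range(n_feat), max(1, n_feat - 1)):
            pairs = sorted(zip((row[j] for row in X), y))
            total = sum(t for _, t in pairs)
            total_sq = sum(t * t for _, t in pairs)
            s = sq = 0.0  # running sums over the left-hand side of the cut
            for cut in range(1, n):
                s += pairs[cut - 1][1]
                sq += pairs[cut - 1][1] ** 2
                if cut < min_leaf or n - cut < min_leaf:
                    continue
                if pairs[cut - 1][0] == pairs[cut][0]:
                    continue  # cannot split between identical feature values
                sse = (sq - s * s / cut) + (total_sq - sq - (total - s) ** 2 / (n - cut))
                if best is None or sse < best[0]:
                    # Threshold is the largest feature value on the left side.
                    best = (sse, j, pairs[cut - 1][0])
    if best is None:
        return lambda x: leaf_value
    _, j, thr = best
    left_idx = [i for i in range(n) if X[i][j] <= thr]
    right_idx = [i for i in range(n) if X[i][j] > thr]
    left = fit_tree([X[i] for i in left_idx], [y[i] for i in left_idx],
                    rng, depth + 1, max_depth, min_leaf)
    right = fit_tree([X[i] for i in right_idx], [y[i] for i in right_idx],
                     rng, depth + 1, max_depth, min_leaf)
    return lambda x: left(x) if x[j] <= thr else right(x)

def fit_forest(X, y, rng, n_trees=25):
    """Average of trees, each grown on a bootstrap resample of the training set."""
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], rng))
    return lambda x: mean([tree(x) for tree in trees])

def r2(model, X, y):
    """Coefficient of determination on held-out data."""
    m = mean(y)
    ss_res = sum((t - model(x)) ** 2 for x, t in zip(X, y))
    ss_tot = sum((t - m) ** 2 for t in y)
    return 1 - ss_res / ss_tot

def compare_models(seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_data(300, rng)
    X_test, y_test = make_data(200, rng)
    linear = fit_linear(X_train, y_train)
    forest = fit_forest(X_train, y_train, rng)
    return r2(linear, X_test, y_test), r2(forest, X_test, y_test)

lin_r2, rf_r2 = compare_models()
print(f"linear regression R^2: {lin_r2:.2f}")
print(f"random forest R^2:     {rf_r2:.2f}")
```

On data like these, a linear model can only approximate the threshold interaction with a tilted plane, while the trees can carve out the high-yield region directly. A real study would more likely use an established library implementation and would evaluate on held-out reactions, as the abstract's mention of out-of-sample prediction suggests.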

Updated: 2018-02-15