Assessing Gaussian Process Regression and Permutationally Invariant Polynomial Approaches To Represent High-Dimensional Potential Energy Surfaces
Journal of Chemical Theory and Computation (IF 5.7), Pub Date: 2018-05-30, DOI: 10.1021/acs.jctc.8b00298
Chen Qu 1, Qi Yu 1, Brian L. Van Hoozen 1, Joel M. Bowman 1, Rodrigo A. Vargas-Hernández 2

The mathematical representation of large data sets of electronic energies has seen substantial progress in the past 10 years. The so-called Permutationally Invariant Polynomial (PIP) representation is one established approach. This approach dates from 2003, when a global potential energy surface (PES) for CH5+ was reported using a basis of polynomials that are invariant with respect to the 120 permutations of the five equivalent H atoms. More recently, several approaches from “machine learning” have been applied to fit these large data sets. Gaussian Process (GP) regression is one such approach. Here, we consider the implementation of the (full) GP due to Krems and co-workers, with a modification that renders it permutationally invariant, which we denote by PIP-GP. This modification uses the approach introduced by Guo and co-workers, and later extended by Zhang and co-workers, to achieve permutational invariance for neural-network fits. The PIP, GP, and PIP-GP approaches are applied to four case studies for fitting data sets of electronic energies: H3O+, OCHCO+, and H2CO/cis-HCOH/trans-HCOH, with the goal of assessing precision, accuracy in normal-mode analysis and barrier heights, and timings. We also report an application to (HCOOH)2, where the full PIP approach is possible but the PIP-GP one is not feasible. However, by replicating data, which is feasible in this case, the GP approach is able to represent the data with precision comparable to that of the PIP approach. We examine these assessments for varying sizes of data sets in each case to determine how the properties of the fits depend on the training data size. We conclude with some comments on the different aspects of computational effort of the PIP, GP, and PIP-GP approaches, and on the challenges these methods face for more “rugged” PESs, exemplified here by H2CO/cis-HCOH/trans-HCOH.
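
The idea of a permutationally invariant basis function can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of Morse variables y_ij = exp(-r_ij / a), the range parameter a = 2.0, and the H3O+-like geometry are all assumptions chosen for illustration. The sketch symmetrizes a single monomial by brute force, summing it over every relabeling of like atoms, which is practical only for small molecules.

```python
import math
from itertools import permutations, product


def morse_variables(coords, a=2.0):
    """Map Cartesian coordinates to y_ij = exp(-r_ij / a) (a is an assumed value)."""
    n = len(coords)
    return {
        (i, j): math.exp(-math.dist(coords[i], coords[j]) / a)
        for i in range(n)
        for j in range(i + 1, n)
    }


def like_atom_permutations(symbols):
    """Yield every index relabeling that only exchanges atoms of the same element."""
    groups = {}
    for idx, sym in enumerate(symbols):
        groups.setdefault(sym, []).append(idx)
    group_list = list(groups.values())
    for combo in product(*(permutations(g) for g in group_list)):
        mapping = {}
        for original, permuted in zip(group_list, combo):
            mapping.update(zip(original, permuted))
        yield mapping


def symmetrized_monomial(coords, symbols, exponents, a=2.0):
    """Sum one monomial in the Morse variables over all like-atom relabelings."""
    y = morse_variables(coords, a)
    total = 0.0
    for perm in like_atom_permutations(symbols):
        term = 1.0
        for (i, j), power in exponents.items():
            key = tuple(sorted((perm[i], perm[j])))
            term *= y[key] ** power
        total += term
    return total


# Hypothetical, deliberately distorted H3O+-like geometry (arbitrary units).
symbols = ["O", "H", "H", "H"]
geom = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.3, 0.95, 0.1), (-0.3, -0.5, 0.9)]
# The same structure with the labels of the first two H atoms exchanged.
swapped = [geom[0], geom[2], geom[1], geom[3]]

# Monomial y_01 * y_12**2, an arbitrary example.
exponents = {(0, 1): 1, (1, 2): 2}

print(symmetrized_monomial(geom, symbols, exponents))
print(symmetrized_monomial(swapped, symbols, exponents))
```

The two printed values agree to within floating-point rounding, whereas the raw Morse variable y_01 changes when the two H labels are exchanged. For CH5+, `like_atom_permutations` yields the 5! = 120 relabelings mentioned in the abstract.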

Updated: 2018-05-30