Exploration of transferable and uniformly accurate neural network interatomic potentials using optimal experimental design
Machine Learning: Science and Technology (IF 6.3), Pub Date: 2021-05-17, DOI: 10.1088/2632-2153/abe294
Viktor Zaverkin, Johannes Kästner

Machine learning has been proven to have the potential to bridge the gap between the accuracy of ab initio methods and the efficiency of empirical force fields. Neural networks are one of the most frequently used approaches to construct high-dimensional potential energy surfaces. Unfortunately, they lack an inherent uncertainty estimation which is necessary for efficient and automated sampling through the chemical and conformational space to find extrapolative configurations. The identification of the latter is needed for the construction of transferable and uniformly accurate potential energy surfaces. In this paper, we propose an active learning approach that uses the estimated model’s output variance derived in the framework of the optimal experimental design. This method has several advantages compared to the established active learning approaches, e.g. Query-by-Committee, Monte Carlo dropout, feature and latent distances, in terms of the predictive power and computational efficiency. We have shown that the application of the proposed active learning scheme leads to transferable and uniformly accurate potential energy surfaces constructed using only a small fraction of data points. Additionally, it is possible to define a natural threshold value for the proposed uncertainty metric which offers the possibility to generate highly informative training data on-the-fly.
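
The idea of ranking candidate configurations by the model's output variance can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a model that is linear in a fixed feature vector (for example, the activations of a network's last hidden layer), so that the output variance of a candidate x is proportional to x^T (Phi^T Phi + ridge * I)^{-1} x, as in classical optimal experimental design. The function names, the `ridge` regularizer, and the toy data are all hypothetical.

```python
import numpy as np


def predictive_variance(train_features, candidate_features, ridge=1e-6):
    """Return x^T (Phi^T Phi + ridge * I)^{-1} x for every candidate row x.

    Assumption: the model is linear in these features, so this quantity is
    proportional to the output variance (the noise scale is left out).
    """
    n_features = train_features.shape[1]
    # Information matrix of the current training set, regularized for stability.
    info = train_features.T @ train_features + ridge * np.eye(n_features)
    # Solve info @ Z = X^T instead of forming an explicit inverse.
    solved = np.linalg.solve(info, candidate_features.T)
    # Row-wise quadratic form x_n^T z_n.
    return np.einsum("nd,dn->n", candidate_features, solved)


def select_most_uncertain(train_features, candidate_features, ridge=1e-6):
    """Pick the candidate with the largest estimated output variance."""
    variances = predictive_variance(train_features, candidate_features, ridge)
    return int(np.argmax(variances)), variances


# Toy data: the training set only covers the first feature direction,
# so a candidate along the second direction is extrapolative.
train = np.array([[1.0, 0.0], [2.0, 0.0]])
candidates = np.array([[1.0, 0.0], [0.0, 1.0]])
best_index, variances = select_most_uncertain(train, candidates, ridge=1.0)
```

In this toy setup the second candidate points in a direction the training set never covers, so it receives the larger variance and would be selected for labeling. A cutoff on the same quantity could serve as an on-the-fly selection criterion of the kind the abstract mentions, although how the paper defines its threshold is not stated here.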



Updated: 2021-05-17