Active learning a coarse-grained neural network model for bulk water from sparse training data
Molecular Systems Design & Engineering (IF 3.2), Pub Date: 2020-03-26, DOI: 10.1039/c9me00184k
Troy D. Loeffler 1, 2, 3, 4 , Tarak K. Patra 1, 2, 3, 4, 5 , Henry Chan 1, 2, 3, 4, 6 , Subramanian K. R. S. Sankaranarayanan 1, 2, 3, 4, 6

Neural network (NN) based potentials represent flexible alternatives to pre-defined functional forms. Well-trained NN potentials are transferable and provide a high level of accuracy on par with the reference model used for training. Despite their tremendous potential and the interest in them, at least two challenges need to be addressed: (1) NN models are interpolative, and hence are trained by generating large quantities (∼10⁴ or greater) of structural data in the hope that the model has adequately sampled the energy landscape both near and far from equilibrium. It is desirable to minimize the amount of training data, especially if the underlying reference model is expensive. (2) NN atomistic potentials (like any other classical atomistic model) are limited in the time scales they can access. Coarse-grained (CG) NN potentials have emerged as a viable alternative. Here, we address these challenges by introducing an active learning (AL) scheme that trains a CG model with a minimal amount of training data. Our AL workflow starts with a sparse training data set (∼1 to 5 data points), which is continually updated via a nested ensemble Monte Carlo scheme that iteratively queries the energy landscape in regions of failure and improves the network performance. We demonstrate that with ∼300 reference data points, our AL-NN is able to accurately predict both the energies and the molecular forces of water, within 2 meV per molecule and 40 meV Å⁻¹ of the reference (coarse-grained bond-order potential) model. The AL-NN water model provides good predictions of several structural, thermodynamic, and temperature-dependent properties of liquid water, with values close to those obtained from the reference model. The AL-NN also captures the well-known density anomaly of liquid water observed in experiments.
Although the AL procedure has been demonstrated for training CG models with sparse reference data, it can be easily extended to develop atomistic NN models against a minimal amount of high-fidelity first-principles data.
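
The active learning loop described in the abstract (start from a few reference points, sample the current model with Monte Carlo, query the reference where the model fails, retrain) can be illustrated with a deliberately small toy. This is a sketch under stated assumptions, not the authors' implementation: a 1D double-well function stands in for the expensive reference model, a least-squares polynomial stands in for the neural network, and plain Metropolis sampling stands in for the nested ensemble Monte Carlo scheme. All function names and parameter values (`reference_energy`, `fit_surrogate`, `tol`, `kT`, and so on) are hypothetical.

```python
import numpy as np

def reference_energy(x):
    """Toy stand-in for the expensive reference model (1D double well)."""
    return (x**2 - 1.0) ** 2

def fit_surrogate(xs, es, degree=6):
    """Least-squares polynomial fit; a cheap stand-in for training the NN."""
    A = np.vander(np.asarray(xs, dtype=float), degree + 1)
    coef = np.linalg.lstsq(A, np.asarray(es, dtype=float), rcond=None)[0]
    return lambda x: np.polyval(coef, x)

def metropolis_samples(energy, x0, n_steps, kT, step, rng, lo, hi):
    """Plain Metropolis walk on the surrogate energy, confined to [lo, hi]."""
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)
        if lo <= x_new <= hi:  # moves outside the box are rejected
            e_new = energy(x_new)
            if e_new <= e or rng.random() < np.exp(-(e_new - e) / kT):
                x, e = x_new, e_new
        samples.append(x)
    return np.array(samples)

def active_learn(n_init=3, tol=0.05, max_rounds=50, seed=0, lo=-2.0, hi=2.0):
    """Grow the training set only where the surrogate disagrees with the reference."""
    rng = np.random.default_rng(seed)
    xs = list(rng.uniform(lo, hi, n_init))       # sparse initial training set
    es = [reference_energy(x) for x in xs]
    model = fit_surrogate(xs, es)
    for _ in range(max_rounds):
        # Explore the landscape as the current surrogate sees it.
        samples = metropolis_samples(model, xs[0], 200, kT=0.5, step=0.3,
                                     rng=rng, lo=lo, hi=hi)
        # Toy shortcut: the reference is evaluated on every sample here,
        # which is only affordable because this reference is trivial.
        errors = np.abs(model(samples) - reference_energy(samples))
        worst = int(np.argmax(errors))
        if errors[worst] < tol:                  # no failures found: stop
            break
        xs.append(float(samples[worst]))         # add the worst failure
        es.append(reference_energy(xs[-1]))
        model = fit_surrogate(xs, es)            # retrain on the larger set
    return model, np.array(xs), np.array(es)

if __name__ == "__main__":
    model, xs, es = active_learn()
    print("reference evaluations kept for training:", len(xs))
```

The stopping test only checks configurations the Monte Carlo walk actually visited, so the surrogate is trusted only in the region it has sampled; that is the sense in which the training set stays small.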

Updated: 2020-03-26