Uncertainty-Quantified Hybrid Machine Learning/Density Functional Theory High Throughput Screening Method for Crystals.
Journal of Chemical Information and Modeling (IF 5.6), Pub Date: 2020-03-25, DOI: 10.1021/acs.jcim.0c00003
Juhwan Noh 1, Geun Ho Gu 1, Sungwon Kim 1, Yousung Jung 1, 2

Computational high throughput screening (HTS) has emerged in recent years as a significant tool in materials science for accelerating the discovery of new materials with target properties. However, despite many successful cases in which HTS led to novel discoveries, its major bottleneck is currently the large computational cost of density functional theory (DFT) calculations, which scale cubically with system size and thus limit the chemical space that can be explored. The present work aims to address this computational burden by presenting a machine learning (ML) framework that can efficiently explore the chemical space. Our model is built upon an existing crystal graph convolutional neural network (CGCNN) to obtain the formation energy of a crystal structure, but is modified to allow uncertainty quantification for each prediction using the hyperbolic tangent activation function and the dropout algorithm (CGCNN-HD). The uncertainty quantification is particularly important because typical usage of CGCNN (due to the lack of a gradient implementation) does not involve structural relaxation, an omission that could cause substantial prediction errors. The proposed method is benchmarked against an existing application that identified a promising photoanode material among >7,000 hypothetical Mg-Mn-O ternary compounds using all DFT-HTS. In our approach, we perform the approximate HTS using CGCNN-HD and refine the results using full DFT for the selected candidates (denoted ML/DFT-HTS). The proposed hybrid model reduces the required DFT calculations by a factor of >50 compared to the previous DFT-HTS while making the same discovery of Mg2MnO4, an experimentally validated new photoanode material. Further analysis demonstrates that adding the HD components with uncertainty measures in the CGCNN-HD model increased the discoverability of promising materials relative to all DFT-HTS from 30% (CGCNN) to 68% (CGCNN-HD). The present ML/DFT-HTS with uncertainty quantification can thus be a fast alternative to DFT-HTS for efficient exploration of the vast chemical space.
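
The dropout-based uncertainty estimate and the two-stage ML/DFT screening described in the abstract can be illustrated with a minimal sketch. This is not the authors' CGCNN-HD code: it uses a toy one-hidden-layer tanh network in plain Python instead of a crystal graph network, and the function names (`mc_dropout_predict`, `select_for_dft`), the weights, the threshold, and the `mean - k * std` selection rule are all assumptions made for illustration. The idea shown is only that keeping dropout active at prediction time and repeating the forward pass gives a spread of outputs whose standard deviation can serve as an uncertainty measure.

```python
import math
import random
import statistics


def forward(x, w1, w2, p_drop, rng):
    """One stochastic pass of a toy tanh network with dropout left ON."""
    hidden = []
    for row in w1:
        h = math.tanh(sum(wi * xi for wi, xi in zip(row, x)))
        # Inverted dropout: drop the unit with probability p_drop,
        # otherwise rescale it so the expected activation is unchanged.
        keep = rng.random() >= p_drop
        hidden.append(h / (1.0 - p_drop) if keep else 0.0)
    return sum(wo * h for wo, h in zip(w2, hidden))


def mc_dropout_predict(x, w1, w2, p_drop=0.2, n_samples=200, seed=0):
    """Return (mean, std) of the prediction over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [forward(x, w1, w2, p_drop, rng) for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.pstdev(samples)


def select_for_dft(candidates, w1, w2, threshold, k=1.0):
    """Hypothetical selection rule: keep a candidate for DFT refinement when
    its optimistic bound (mean - k * std) does not exceed the threshold."""
    selected = []
    for name, x in candidates:
        mean, std = mc_dropout_predict(x, w1, w2)
        if mean - k * std <= threshold:
            selected.append((name, mean, std))
    return selected


if __name__ == "__main__":
    # Made-up weights and feature vectors, purely for demonstration.
    w1 = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
    w2 = [0.4, -0.6, 0.9]
    candidates = [("structure_A", [1.0, 2.0]), ("structure_B", [-1.5, 0.3])]
    for name, mean, std in select_for_dft(candidates, w1, w2, threshold=0.0):
        print(f"{name}: predicted {mean:.3f} +/- {std:.3f} -> send to DFT")
```

In a screening setting of the kind the abstract describes, only the candidates returned by a filter such as `select_for_dft` would be passed on to full DFT; widening the filter by the uncertainty is one plausible way to avoid discarding structures whose unrelaxed predictions are unreliable.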

Updated: 2020-03-25