Crystallographic groups prediction from chemical composition via deep learning
Chinese Journal of Chemical Physics, Pub Date: 2023-03-21, DOI: 10.1063/1674-0068/cjcp2107124
Da-yong Wang, Hai-feng Lv, Xiao-jun Wu

The crystallographic group is an important characteristic for describing a crystal structure, but it is difficult to identify when only the chemical composition is given. Here, we present a machine-learning method that predicts the crystallographic group of a crystal structure from its chemical formula. We investigate 34528 stable compounds spanning 230 crystallographic groups, using 72% of the data set for training, 8% for validation, and 20% for testing. The resulting model places the correct crystallographic group among its top-1, top-5, and top-10 predictions with estimated accuracies of 60.8%, 76.5%, and 82.6%, respectively. A comparison between the validation and test sets indicates that the deep-learning model generalizes well. Additionally, the 230 crystallographic groups are reclassified into 19 new labels: 18 heavily represented crystallographic groups, each containing more than 400 compounds, and one combined group holding the remaining compounds from the other 212 crystallographic groups. A deep-learning model trained on these 19 labels identifies the crystallographic group with an estimated accuracy of 72.2%. Our results provide a promising approach to identifying the crystallographic group of a crystal structure from its chemical composition alone.
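
The evaluation setup described in the abstract (a 72%/8%/20% split, top-k accuracy, and merging sparsely populated groups into one label) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the random shuffling with a fixed seed, the rounding of split sizes, and the "other" label are all assumptions made for illustration, and the abstract does not say how the split or the scoring was actually implemented.

```python
import random
from collections import Counter


def split_indices(n, train=0.72, val=0.08, seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    Assumption: a plain random split; the remainder after train and
    validation becomes the test set.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: one list of per-class scores per sample.
    labels: the true class index for each sample.
    """
    hits = 0
    for row, y in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += y in ranked[:k]
    return hits / len(labels)


def merge_rare_labels(labels, min_count=400, other="other"):
    """Keep labels with more than min_count samples; map the rest to one merged label."""
    counts = Counter(labels)
    return [y if counts[y] > min_count else other for y in labels]
```

Calling `split_indices(34528)` partitions the compound indices into three disjoint lists, `top_k_accuracy(scores, labels, 5)` corresponds to the top-5 metric quoted in the abstract, and `merge_rare_labels` with `min_count=400` mirrors the threshold used to build the 19-label scheme.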

Updated: 2023-03-22