Communications Materials, Pub Date: 2020-07-30, DOI: 10.1038/s43246-020-00052-8. Kan Hatakeyama-Sato, Kenichi Oyaizu
In data-intensive science, machine learning plays a critical role in processing big data. However, the potential of machine learning has been limited in the field of materials science because of the difficulty in treating complex real-world information as a digital language. Here, we propose to use graph-shaped databases with a common format to describe almost any materials science experimental data digitally, including chemical structures, processes, properties, and natural languages. The graphs can express real-world data with little information loss. In our approach, a single neural network treats the versatile materials science data collected from over ten projects, whereas traditional approaches require a separate model for each database and property. The multitask learning of miscellaneous factors increases the prediction accuracy of parameters synergistically by acquiring broad knowledge in the field. The integration is beneficial for developing general prediction models and for solving inverse problems in materials science.
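To make the idea of a common graph format more concrete, here is a minimal Python sketch of how one experimental record might be written as a graph and how a property value could be hidden to serve as a prediction target. It is an illustration under stated assumptions, not the authors' schema or model: the class `ExperimentGraph`, the node kinds (`compound`, `process`, `property_name`, `property_value`, `unit`), and the masking helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentGraph:
    """Toy graph (hypothetical format): typed nodes plus directed edges."""
    nodes: list = field(default_factory=list)  # (kind, value) tuples
    edges: list = field(default_factory=list)  # (source_index, target_index) pairs

    def add_node(self, kind, value):
        """Append a node and return its index."""
        self.nodes.append((kind, value))
        return len(self.nodes) - 1

    def add_edge(self, source, target):
        """Record a directed edge between two node indices."""
        self.edges.append((source, target))

    def mask_targets(self, kind="property_value"):
        """Return (masked_nodes, targets): nodes of `kind` get value None,
        and their original values are collected by node index."""
        masked, targets = [], {}
        for index, (node_kind, value) in enumerate(self.nodes):
            if node_kind == kind:
                targets[index] = value
                masked.append((node_kind, None))
            else:
                masked.append((node_kind, value))
        return masked, targets


def build_example():
    """Encode one record: ethanol, measured at 20 C, density 0.789 g/cm3."""
    graph = ExperimentGraph()
    compound = graph.add_node("compound", "CCO")  # SMILES string for ethanol
    process = graph.add_node("process", "measure at 20 C")
    name = graph.add_node("property_name", "density")
    value = graph.add_node("property_value", 0.789)
    unit = graph.add_node("unit", "g/cm3")
    # Chain the record: structure -> process -> property -> value -> unit.
    graph.add_edge(compound, process)
    graph.add_edge(process, name)
    graph.add_edge(name, value)
    graph.add_edge(value, unit)
    return graph


if __name__ == "__main__":
    example = build_example()
    masked_nodes, targets = example.mask_targets()
    print(masked_nodes)
    print(targets)
```

Because structures, process descriptions, and property values all share one node-and-edge representation in this sketch, records from different projects could be fed to the same model, with any masked node treated as one more prediction task.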
Title: Integrating multiple materials science projects in a single neural network