Is Domain Knowledge Necessary for Machine Learning Materials Properties?

  • Technical Article
Integrating Materials and Manufacturing Innovation

Abstract

New featurization schemes for describing materials as composition vectors in order to predict their properties using machine learning are common in the field of Materials Informatics. However, little is known about the comparative efficacy of these methods. This work sets out to make clear which featurization methods should be used across various circumstances. Our findings include, surprisingly, that simple fractional and random-noise representations of elements can be as effective as traditional and new descriptors when using large amounts of data. However, in the absence of large datasets or for data that is not fully representative, we show that the integration of domain knowledge offers advantages in predictive ability.
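
To make the two simple representations named in the abstract concrete, the following minimal Python sketch builds a fractional composition vector and a random-noise element representation for a chemical formula. It is an illustration under stated assumptions, not the authors' implementation: the six-element list, the formula parser, the vector dimension, and the fraction-weighted average used to combine element vectors are all hypothetical choices that the abstract does not specify.

```python
import random
import re

# Hypothetical, deliberately tiny element list (an assumption for this sketch);
# a real featurizer would cover the whole periodic table.
ELEMENTS = ["H", "O", "Al", "Si", "Ti", "Fe"]


def parse_formula(formula):
    """Parse a simple formula such as 'Al2O3' into {element: amount}.

    Parentheses and charges are not supported in this sketch.
    """
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def fractional_vector(formula):
    """Fractional encoding: one slot per element, holding its atomic fraction."""
    counts = parse_formula(formula)
    total = sum(counts.values())
    return [counts.get(el, 0.0) / total for el in ELEMENTS]


def random_element_table(dim=4, seed=0):
    """Assign each element a fixed vector of Gaussian noise (no chemistry encoded)."""
    rng = random.Random(seed)
    return {el: [rng.gauss(0.0, 1.0) for _ in range(dim)] for el in ELEMENTS}


def weighted_average_vector(formula, table):
    """Combine per-element vectors by atomic fraction.

    The weighted average is an assumed pooling choice for illustration only.
    """
    counts = parse_formula(formula)
    total = sum(counts.values())
    dim = len(next(iter(table.values())))
    vec = [0.0] * dim
    for el, amount in counts.items():
        for i, value in enumerate(table[el]):
            vec[i] += (amount / total) * value
    return vec


frac = fractional_vector("Al2O3")
noise_table = random_element_table(dim=4, seed=0)
noise = weighted_average_vector("Al2O3", noise_table)
```

Here `frac` has one entry per listed element, with nonzero entries only for O and Al, while `noise` is a four-dimensional vector that carries no chemical knowledge beyond which elements are present and in what proportion. Either vector could be passed to a regression model as the input features.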

Acknowledgements

The authors gratefully acknowledge support from the NSF CAREER Award DMR 1651668. The authors also thank the Berlin International Graduate School in Model and Simulation based Research as well as the German Academic Exchange Service (Program No. 57438025) for their financial support. Special thanks are given to Dr. Aleksander Gurlo for advising Anthony Yu-Tung Wang and encouraging his collaborative stay at the University of Utah. The authors thank the creators of AFLOW for building the database and for making the material properties available for this study. In addition, the authors express their gratitude to the open-source software community for developing the excellent tools used in this research, including but not limited to Python, Pandas, NumPy, matplotlib, scikit-learn, and TensorFlow.

Author information

Corresponding author

Correspondence to Taylor D. Sparks.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Murdock, R.J., Kauwe, S.K., Wang, A.YT. et al. Is Domain Knowledge Necessary for Machine Learning Materials Properties? Integr Mater Manuf Innov 9, 221–227 (2020). https://doi.org/10.1007/s40192-020-00179-z

  • DOI: https://doi.org/10.1007/s40192-020-00179-z
