Abstract
Artificial intelligence (AI) is integral to Industry 4.0 and the evolution of smart factories. To realize this future, materials processing industries are beginning to adopt AI technologies across their enterprises and plants; however, as with any new technology, there is potential for misuse and for the false belief that the outcomes are reliable. The goal of this paper is to provide context for the application of machine learning to materials processing. The general landscapes of data science and materials processing are presented, using the foundry and the metal casting industry as an exemplar. Typical foundry data pose four challenges: they are unbalanced, semi-supervised, heterogeneous, and limited in sample size. Data science methods to address these issues are presented and discussed. The elements of a data science project are outlined and illustrated by a case study using sand cast foundry data. Finally, a prospective view is given of the application of data science to materials processing and of the impact this will have on the field.
Acknowledgements
The authors would like to thank the ACRC consortium members for their support and data for this project.
Cite this article
Sun, N., Kopper, A., Karkare, R. et al. Machine Learning Pathway for Harnessing Knowledge and Data in Material Processing. Inter Metalcast 15, 398–410 (2021). https://doi.org/10.1007/s40962-020-00506-2
DOI: https://doi.org/10.1007/s40962-020-00506-2