Abstract
Big data, currently one of the most popular topics, plays an important role in both academic research and practical applications. In the manufacturing industry, however, it is difficult to make full use of research results for production optimization and management because real workshop data are of low quality. Typical quality problems of real workshop data concern the information match degree, missing recessive data, and false error identification. Conventional data analysis methods cannot handle most of these issues because they do not take professional insight and domain knowledge about the data into account. The main motivation of this paper is to explore methods for analyzing and evaluating big data with domain knowledge. For this purpose, real production data from a semiconductor manufacturing workshop are used as the data object. First, a series of data analysis techniques with domain knowledge are developed for diagnosing imperfections in the data. Then, corresponding data processing techniques with domain knowledge are proposed to solve those data quality problems according to the specific flaws in the data. Furthermore, this paper proposes quantitative methods for calculating data value density, which determine the extent to which the proposed data processing techniques can improve data quality. Case studies demonstrate that data analysis and processing techniques with domain knowledge can effectively handle the quality problems of real workshop data in terms of the information match degree, missing recessive data, and false error identification. The work in this paper has the potential to be extended and applied to big data applications beyond the manufacturing industry.
Acknowledgements
This research was supported in part by the National Natural Science Foundation of China (No. 71690234) and the Fundamental Research Funds for the Central Universities.
About this article
Cite this article
Kong, W., Qiao, F. & Wu, Q. Real-manufacturing-oriented big data analysis and data value evaluation with domain knowledge. Comput Stat 35, 515–538 (2020). https://doi.org/10.1007/s00180-019-00919-6
DOI: https://doi.org/10.1007/s00180-019-00919-6