Real-manufacturing-oriented big data analysis and data value evaluation with domain knowledge
Computational Statistics (IF 1.0) Pub Date: 2019-09-23, DOI: 10.1007/s00180-019-00919-6
Weichang Kong, Fei Qiao, Qidi Wu

As one of the most popular topics currently, big data has played an important role in both academic research and practical applications. However, in the manufacturing industry, it is difficult to make full use of the research results for production optimization and/or management due to the low quality of real workshop data. Typical quality problems of real workshop data include the information match degree, missing recessive data, and false error identification. The conventional data analysis methods cannot handle most such issues because these methods fail to consider professional insights into and domain knowledge about the data. The main motivation of this paper is to explore methods for analyzing and evaluating big data with domain knowledge. For this purpose, real production data from a semiconductor manufacturing workshop are adopted as the data object. First, a series of data analysis techniques with domain knowledge are developed for diagnosing the imperfections. Then, corresponding data processing techniques with domain knowledge are proposed for solving those data quality problems according to specific flaws in the data. Furthermore, this paper proposes quantitative calculation methods of data value density to determine the extent to which data quality can be improved by the proposed data processing techniques. Case studies are conducted to demonstrate that data analysis and processing techniques with domain knowledge can effectively handle data quality problems of real workshop data in terms of the information match degree, missing recessive data, and false error identification. The work in this paper has the potential to be further extended and applied to other big data applications beyond the manufacturing industry.

Updated: 2019-09-23