Automated search of process control limits for fault detection in time series data

doi:10.1016/j.jprocont.2022.07.002

Journal of Process Control

Volume 117, September 2022, Pages 52-64

https://doi.org/10.1016/j.jprocont.2022.07.002

Highlights

• Summary of existing time series classification methods and their practical feasibility for productive deployment.
• Presentation of the algorithm including a brief introduction into the theoretical fundamentals of the underlying statistical techniques.
• Experimental evaluation and validation of the algorithm on public datasets and real-world manufacturing data.
• Publication of the source code to facilitate adoption by practitioners and researchers.

Abstract

Manually defined control limits remain a common strategy for quality control in manufacturing due to their ease of deployment on the shop floor compared to more advanced data analysis approaches. Despite their continued importance, there is no systematic method of defining these control limits. However, sub-optimal control limits can lead to undetected faults or cause unnecessary interruption to production. This manuscript presents an algorithm that systematizes this manual process into an efficient search task. We conceptualized the search task as a sequence of sub-problems that are based on the conventional steps taken by process experts when defining control limits. This algorithm can be integrated into an expert tool for shop floor personnel to automate the definition of control limits in annotated time series data. We demonstrate the efficacy of the control limits found by our algorithm by comparing them to those manually defined by process experts in real-world process data from the automotive industry. Furthermore, we show that our algorithm generalizes to traditional time series classification problems and achieves state-of-the-art performance on selected benchmark datasets. Our work is the first effort in automating the otherwise manual definition of control limits for fault detection.

Introduction

Control limits are a common process monitoring approach in the manufacturing domain to preemptively detect quality faults. Specifically, control limits raise a warning if the process deviates from a predefined normal state. Often, statistical control limits are used as part of a broader methodology known as statistical process control (SPC) [1], [2]. Statistical control limits are derived from the variation of the process and used for basic quality control. A process is considered under control if it lies within these control limits and exhibits minimal variation over time. Conversely, values lying outside statistical control limits indicate the presence of an uncontrolled influence on the process. Aside from statistical process control limits, process experts may manually define additional control limits (i.e., empirical control limits) to detect the occurrence of a specific fault. This is especially relevant for the monitoring of time series process data, as it is difficult to generalize statistical process control to time series data [3], [4], [5], [6]. A real-world example is depicted in Fig. 1, where empirical control limits are used to detect a verified manufacturing fault in an otherwise under-control tightening process.
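The two kinds of control limits described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common three-sigma convention for statistical control limits and models an empirical control limit as a hypothetical rectangular window (a range of time indices plus a lower and an upper bound) that a curve must not leave. The function names, the window representation and the toy data are all assumptions made for illustration.

```python
from statistics import mean, stdev


def statistical_limits(values, k=3.0):
    """Return (lower, upper) limits as mean -/+ k sample standard deviations.

    k=3 is the widely used three-sigma convention, assumed here.
    """
    center = mean(values)
    spread = stdev(values)  # sample standard deviation
    return center - k * spread, center + k * spread


def violates_window(series, start, stop, lower, upper):
    """True if any sample in series[start:stop] lies outside [lower, upper]."""
    return any(v < lower or v > upper for v in series[start:stop])


# Toy data, invented for illustration only.
peak_values = [10.0, 10.2, 9.8, 10.1, 9.9]
lower, upper = statistical_limits(peak_values)

normal_curve = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
faulty_curve = [0.0, 1.0, 3.5, 3.0, 4.0, 5.0]

# Hypothetical empirical control limit: samples 1..3 must stay in [0.5, 3.2].
print(violates_window(normal_curve, 1, 4, 0.5, 3.2))  # False
print(violates_window(faulty_curve, 1, 4, 0.5, 3.2))  # True
```

In this sketch the statistical limits summarize the variation of a scalar feature, whereas the window check operates directly on the time series, which is the setting in which empirical control limits are used.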

Effectively, process experts use empirical control limits to perform a one-vs-all classification of time series data. This is not ideal, as empirical control limits merely detect if a process behaves differently from a defined (normal) state A. Instead, the objective should be to be warned if a process behaves similarly to a particular abnormal state B. There is a wide range of classification algorithms to classify time series directly, ranging from statistical modeling to machine learning and deep learning [7]. Researchers have empirically demonstrated the applicability of these methods for numerous real-world applications [8]. Nevertheless, empirical control limits remain widespread in practice for fault detection [9]. This is largely motivated by their ease of implementation and deployment. Training and validating a classification model is often an iterative process in practice that requires both time and expertise on the part of the model developer. In contrast, empirical control limits are easily implemented in existing process monitoring and control systems through a familiar human–machine interface. Thus, they can be deployed as an immediate countermeasure in response to a quality issue while remedial action is underway. Furthermore, the direct deployment of a model to production machines is not always possible due to security concerns and hardware constraints [10], [11]. A real-time capable model-based warning system often requires additional infrastructure. In contrast, the infringement of empirical control limits raises a real-time warning without requiring any additional investment.

Empirical control limits continue to be highly relevant as a standard tool for fault detection. They enable a fast and effective response from shop floor personnel. The high practical relevance would suggest that established tools exist to automatically find empirical control limits for a given defect. At the very least, one would expect the manual process of searching for these control limits to follow a systematic methodology. However, there are no empirical studies or systematic research in this area. This is remarkable, considering the drawbacks of sub-optimal empirical control limits. Conservative control limits can cause several processes to be incorrectly classified as faulty (i.e., false positives). Such misclassification results in unnecessary rework or scrap and could severely degrade machine productivity. On the other hand, lax control limits can fail to detect several faults (i.e., false negatives).
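The trade-off between conservative and lax control limits can be made concrete with a small sketch. It assumes a deliberately simplified setting that is not taken from the paper: each process is reduced to a single scalar feature, and a warning is raised whenever that feature exceeds a hypothetical upper limit. The data and the function name are invented for illustration.

```python
def confusion_counts(values, is_faulty, upper_limit):
    """Count outcomes when a warning is raised for every value above upper_limit."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for value, faulty in zip(values, is_faulty):
        warned = value > upper_limit
        if warned and faulty:
            counts["tp"] += 1  # fault correctly detected
        elif warned and not faulty:
            counts["fp"] += 1  # good process flagged: rework or scrap
        elif not warned and faulty:
            counts["fn"] += 1  # fault missed
        else:
            counts["tn"] += 1  # good process passed
    return counts


# Toy data, invented for illustration only.
values = [1.0, 1.2, 1.4, 1.6, 2.0, 2.2, 2.6, 3.0]
is_faulty = [False, False, False, False, True, False, True, True]

conservative = confusion_counts(values, is_faulty, upper_limit=1.3)
lax = confusion_counts(values, is_faulty, upper_limit=2.8)

print(conservative)  # {'tp': 3, 'fp': 3, 'fn': 0, 'tn': 2}
print(lax)           # {'tp': 1, 'fp': 0, 'fn': 2, 'tn': 5}
```

On this toy data the conservative limit catches every fault at the cost of three false positives, while the lax limit raises no false alarm but misses two of the three faults.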

Setting empirical control limits is a delicate balancing act. Process experts often rely on feedback from the shop floor to adjust the control limits if either the false positive or false negative rate is unacceptably high. Also, they may conduct experiments to replicate the underlying fault and collect additional data to set more effective control limits. An algorithm that uses all available data to find empirical control limits can substantially reduce the associated effort, shorten the feedback cycle or even avoid it altogether.

The key contribution of the work presented in this manuscript is the automation of the manual process of defining control limits in time series data. To the best of our knowledge, there is no dedicated literature concerned with automating this task. The algorithm comprises a sequence of steps that are loosely inspired by the steps performed by process experts when manually defining such control limits. The manual process is broken down into sub-problems. The algorithm uses statistical techniques to solve these sub-problems individually; the final control limits are an aggregate result of these solutions. This top-down approach results in an efficient search algorithm that can be easily automated. The user can adjust the minimum precision to return control limits that are more suitable for a particular fault detection strategy. The algorithm is validated using real-world process data from the automotive industry as well as selected benchmark datasets. The former shows its applicability for the intended task, i.e., fault detection in the manufacturing domain, while the latter serves to illustrate its generalizability. For the real-world data, the control limits found by the algorithm show slightly better performance than those manually defined by process experts. For the benchmark data, the algorithm matches state-of-the-art performance.
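The role of a user-defined minimum precision can be illustrated with a naive search. The sketch below is explicitly not the algorithm proposed in the paper, which decomposes the task into sub-problems solved with statistical techniques. It is a brute-force stand-in that assumes one scalar feature per process and a single hypothetical upper limit, and it returns the limit with the highest recall among those that satisfy the precision constraint. All names and data are assumptions.

```python
def search_upper_limit(values, is_faulty, min_precision=0.9):
    """Exhaustively search observed values for the best upper limit.

    Returns (limit, precision, recall) with the highest recall among all
    candidates whose precision is at least min_precision, or None.
    """
    n_faulty = sum(is_faulty)
    best = None
    for limit in sorted(set(values)):
        warned = [v > limit for v in values]
        n_warned = sum(warned)
        if n_warned == 0 or n_faulty == 0:
            continue  # precision or recall undefined for this candidate
        tp = sum(w and f for w, f in zip(warned, is_faulty))
        precision = tp / n_warned
        recall = tp / n_faulty
        if precision >= min_precision and (best is None or recall > best[2]):
            best = (limit, precision, recall)
    return best


# Toy data, invented for illustration only.
values = [1.0, 1.2, 1.4, 1.6, 2.0, 2.2, 2.6, 3.0]
is_faulty = [False, False, False, False, True, False, True, True]

relaxed = search_upper_limit(values, is_faulty, min_precision=0.7)
strict = search_upper_limit(values, is_faulty, min_precision=0.9)
print(relaxed[0], strict[0])  # 1.6 2.2
```

Raising the minimum precision moves the selected limit from 1.6 to 2.2 on this toy data: false alarms disappear, but one of the three faults is no longer detected.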

To summarize, this paper makes the following contributions to the current state-of-the-art:

Contribution 1.
Analysis and systematization of the steps performed by process experts when manually defining control limits in practice.
Contribution 2.
Presentation of a novel algorithm that automates the search for control limits for fault detection in time series data.
Contribution 3.
Validation of the algorithm using quality faults in real-world process data from the manufacturing domain. We show that control limits found by the algorithm are at least as effective as those manually defined by process experts.
Contribution 4.
We demonstrate the scalability of the algorithm for traditional time series classification by testing it on selected benchmark datasets.

The rest of the paper is structured as follows: in Section 2, we introduce the concept of control limits for fault detection and provide an overview of applicable approaches. We present the theoretical background for the definition of control limits in Section 3. In Section 4, we describe the implementation of the proposed algorithm. In Section 5, we contrast its performance with manually defined control limits for real-world data. Additionally, we evaluate its performance for traditional time series classification. We discuss the applicability and limitations of the algorithm in Sections 6 and 7.

Section snippets

Fault detection

In this section, we motivate why finding empirical control limits for fault detection is an important contribution to the state of the art in process and quality control.

Theoretical background

In this section, we discuss the empirical observations that motivated the design considerations of the proposed algorithm and provide a brief theoretical background regarding the underlying statistical techniques used for its implementation.

Proposed method

In this section, we describe the implementation of the statistical techniques introduced in the previous section and how they are merged into a single algorithm. A flowchart of the algorithm is depicted in Fig. 4.

Experiments

In this section, we provide an overview of the datasets, evaluation metrics and results of our experiments. Source code and datasets to reproduce our results are publicly available [42].

Discussion

In this section, we discuss the practical applicability of the proposed algorithm. Concretely, we are looking to answer questions that may arise regarding its application, limitations as well as other factors that should be considered when interpreting the results.

Conclusion

In this paper, we propose an algorithm that uses statistical techniques to automate a systematic definition of control limits for fault detection.

Control limits have been standard practice for quality control in the manufacturing domain for decades. Despite the feasibility of more advanced time series classification algorithms, they remain widespread in practice. The main reasons for their steady popularity are the speed and ease with which they can be implemented by shop floor personnel. While

CRediT authorship contribution statement

Thomas Schlegl: Conceptualization, Methodology, Resources, Data curation, Writing – original draft, Writing – review & editing, Visualization, Project administration. Domenico Tomaselli: Software, Methodology, Validation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing, Visualization. Stefan Schlegl: Conceptualization, Methodology. Nikolai West: Review & editing. Jochen Deuse: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (49)

He Q.P. et al., Statistical process monitoring as a big data analytics tool for smart manufacturing, J. Process Control (2018)
Boels L. et al., Conceptual difficulties when interpreting histograms: A review, Educ. Res. Rev. (2019)
Woodall W. et al., Research issues and ideas in statistical process control, J. Qual. Technol. (1999)
Stone R. et al., Time series models in statistical process control: Considerations of applicability, J. R. Statist. Soc. (1995)
Alwan L. et al., Time-series modeling for statistical process control, J. Bus. Econ. Statist. (1988)
Knoth S. et al., Control charts for time series: A review
Kramer H. et al., Control charts for time series, Nonlinear Anal. Theory Methods Appl. (1997)
Ismail Fawaz H. et al., Deep learning for time series classification: a review, Data Min. Knowl. Discov. (2019)
Bagnall A. et al., The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances, Data Min. Knowl. Discov. (2017)
Iqbal R. et al., Fault detection and isolation in industrial processes using deep learning approaches, IEEE Trans. Ind. Inf. (2019)
L. Baier, F. Jöhren, S. Seebacher, Challenges in the deployment and operation of machine learning in practice, in:...
Paleyes A. et al., Challenges in deploying machine learning: a survey of case studies
Cooke T.A. et al., Dynamic statistical process control limits for power quality trend data
Mueen A. et al., Time series join on subsequence correlation
Berndt D. et al., Using dynamic time warping to find patterns in time series
C. Ratanamahatana, E. Keogh, Three Myths about Dynamic Time Warping Data Mining, in: Proceedings of the 2005 SIAM...
Dau A. et al., Judicious setting of dynamic time warping’s window width allows more accurate classification of time series
Dau A. et al., Optimizing dynamic time warping’s window width for time series data mining applications, Data Min. Knowl. Discov. (2018)
Keogh E., Exact indexing of dynamic time warping, Knowl. Inf. Syst. (2005)
Rabiner L. et al., Fundamentals of Speech Recognition (1993)
Christ M. et al., Distributed and parallel time series feature extraction for industrial big data applications, Neurocomputing (2017)
Zhang H. et al., Feature extraction for time series classification using discriminating wavelet coefficients
Maggipinto M. et al., A deep learning-based approach to anomaly detection with 2-dimensional data in manufacturing
