Abstract

To address the scarcity of abnormal instances in quality databases and the delay in discovering quality anomalies, this paper proposes a method for recognizing quality anomalies from control chart data using a probabilistic neural network (PNN) optimized by an improved genetic algorithm, which compensates for the deficiencies of SPC control charts in practical application. Principal component analysis (PCA) reduces the dimension of the original control chart data and extracts its features, which shortens the training time of the PNN. Owing to its simple network structure and good recognition performance, the PNN successfully recognizes both single and mixed control chart patterns. To avoid relying on an empirically chosen value, the key parameter of the PNN is optimized by the improved single-objective genetic algorithm (SGA), which gives the PNN a higher recognition accuracy than a PNN optimized by the standard genetic algorithm. Finally, the method is validated by a simulation experiment, in which it outperforms the traditional BP neural network, a single PNN, PCA-PNN without parameter optimization, and an SVM optimized by the particle swarm optimization algorithm.

1. Introduction

The core of quality control lies in the full utilization and analysis of quality-related data. Its goal is to identify quality problems early and take appropriate measures to solve them so that the production system remains stable over the long term. How to organize and utilize quality data effectively has therefore been widely studied by enterprises and scholars. Statistical process control (SPC) is currently the most widely used quality control method. The control chart was first proposed by W. A. Shewhart of the United States: sample points are plotted on the chart according to a prescribed procedure, and the distribution of these points is used to determine whether the production process is in a stable, controlled state. Based on the principle of statistical significance, the control chart divides fluctuations of the production system into system fluctuations and abnormal fluctuations. System fluctuations always exist and cannot be eliminated, whereas abnormal fluctuations show certain regularities on the control chart and can be eliminated by appropriate control measures [1].

Since the control chart can monitor the running state of the production system and predict possible quality problems, identifying quality anomalies is mainly a matter of recognizing anomaly patterns in SPC control charts. Early anomaly pattern recognition mainly adopted rule-based expert systems (ESs) to determine anomaly patterns [2]. However, rules based on statistical properties suffer from the difficulty that similar statistical properties may result from different patterns, which can lead to incorrect recognition [3]. In addition, a rule-based expert system requires a large number of determination rules, which can cause a combinatorial explosion of rules; moreover, its recognition accuracy is strongly affected by random factors [4]. Artificial neural networks (ANNs) are capable of learning and self-organizing and hence have been widely adopted in the field of control chart pattern recognition (CCPR). To address the common difficulty of discriminating between pattern types that share similar features, Guh et al. proposed an artificial neural network for online control chart pattern detection and discrimination [5]. Spoerre and Perry proposed a control chart pattern recognition method using a backpropagation neural network, and their results indicated that the network identified control chart patterns very accurately [6]. El-Midany et al. proposed a framework for control chart pattern recognition in a multivariate process using artificial neural networks [7]. Wu and Wu investigated control chart pattern recognition based on a wavelet neural network [8]. Ebrahimzadeh et al. presented a novel hybrid intelligent method (HIM) for recognizing the common types of control chart pattern (CCP) using K-MICA clustering and neural networks [9]. Addeh et al. proposed a control chart pattern recognition method using an RBF neural network with a new training algorithm and practical features [10]. From the late 1990s until 2010, feature-based and wavelet-denoised input representation techniques were studied to boost the recognition performance of ANNs. The most significant works include wavelet-ANN [11], shape features-ANN [12], and statistical features-ANN [13].

Most of the previous works used raw process data as the input vector for CCPR [14]. In recent years, research has concentrated on feature extraction and on improving classification algorithms. Recognizing and diagnosing abnormalities directly from the original data may make training too slow because the dimension of the input data is too high. Consequently, some studies first extracted features of the control chart [15] and then identified its patterns. Zhang and Cheng integrated shape and statistical features with principal component analysis (PCA) to reduce the feature dimension [16]. Gauri and Chakraborty first extracted seven features of the control chart based on the CART regression tree and then used an artificial neural network (ANN) to recognize patterns, which achieved good results [17]. Gauri and Chakraborty investigated feature-based pattern recognition of the control chart and found that feature-based ANN recognizers could achieve better recognition performance than feature-based heuristic recognizers [18]. Gauri and Chakraborty also studied the recognition of control chart patterns by improved selection of features [17]. Ebrahimzadeh and Ranaee studied control chart pattern recognition using an optimized neural network and efficient features, in which the entropies of the wavelet packets were applied for the first time in the feature extraction module [19]. Cheng et al. studied the recognition of control chart patterns using a neural network-based pattern recognizer with features extracted from correlation analysis and demonstrated, using synthetic pattern data, the superior performance of the feature-based recognizer over the raw data-based one [20]. Xanthopoulos and Razzaghi performed control chart pattern recognition by combining statistical feature extraction with a weighted support vector machine [21]. Kao et al. proposed a multistage control chart pattern recognition scheme based on independent component analysis and the support vector machine and achieved accurate and stable recognition results [22]. Zhao et al. studied control chart pattern recognition using improved supervised locally linear embedding and the support vector machine, which effectively eliminated redundant features in the feature set and reduced the complexity of the classification model [23]. Liu and Zhou first extracted the mean characteristics of the data and then divided the anomaly patterns into three categories through one ANN layer; a second layer then classified the specific anomalies through three SVM classifiers whose key parameters were optimized by the particle swarm optimization (PSO) algorithm [24]. Zan et al. proposed a control chart pattern recognition method based on the convolutional neural network and verified its feasibility and effectiveness through Monte Carlo simulation [25]. In addition, newer neural networks such as the Elman network [26] and the spiking neural network (SNN) [27] were also applied to the identification of anomaly patterns. As the above review shows, most scholars adopted ANNs in the field of CCPR because of their significant advantages. One disadvantage of the ANN, however, is that its topology and architecture are difficult to select, and its training parameters are also hard to determine. In view of this problem, Guh used the genetic algorithm (GA) to evolve the configuration and the training parameter set of the ANN model in solving the online CCPR problem [28]. Addeh et al. investigated the design of an accurate system for control chart pattern (CCP) recognition, in which the cuckoo optimization algorithm (COA) was applied to find the optimal parameters of the radial basis function neural network (RBFNN) [29].

In recent years, with the successful application and in-depth investigation of many machine learning techniques (e.g., artificial neural networks (ANNs) and support vector machines (SVMs)) in the field of CCPR, a few scholars have paid attention to the identification of concurrent CCPs (two or more basic CCPs occurring simultaneously). Yang et al. identified and quantified concurrent control chart patterns using extreme-point symmetric mode decomposition and extreme learning machines [30]. Zhang et al. investigated the recognition of mixture control chart patterns based on fusion feature reduction and a fireworks algorithm-optimized MSVM [31]. However, research on the identification of concurrent CCPs remains limited and needs to be studied further.

Although the above studies provide important references for subsequent research, these methods still have certain defects. The topology and architecture of the ANN model are difficult to select, and its training parameters are also hard to determine. The SVM is inherently a binary classifier, which makes it less efficient for the multiclass identification of abnormal patterns. In addition, several kinds of abnormal patterns often appear simultaneously in actual production, whereas current research mostly focuses on the identification of single anomalies. Consequently, research on CCPR needs to be expanded further.

Based on the above, this paper proposes a quality anomaly recognition method based on a probabilistic neural network optimized by an improved genetic algorithm, which also recognizes mixed anomaly patterns. The simulation experiment shows that the proposed method has advantages in the accuracy of anomaly recognition, the speed of network training, and ease of use.

2. Quality Abnormal Patterns and Description

In 1958, Western Electric Company first proposed the problem of control chart pattern recognition and summarized the abnormal patterns of the control chart [32]; later scholars have studied these abnormal patterns and their identification in depth. With the development of artificial intelligence technology, using it to recognize and determine abnormal control chart patterns has become one of the important research topics in the field of quality diagnosis. There are six basic modes of the control chart, as shown in Figure 1. Different abnormal modes, and combinations of them, reflect different states and unstable factors of the production process. Starting from the abnormal mode of the control chart, hidden quality hazards in the current production process can be discovered and corresponding measures taken in time.

The production process can be expressed by the following formula:

$$y(t) = \mu + x(t) + d(t),$$

where $t$ represents the sampling moment of the production data, $y(t)$ represents the observed value of the production process, $\mu$ is the mean of the quality characteristic when production is stable, and $d(t)$ is the abnormal fluctuation caused by special factors; different forms of $d(t)$ lead to different abnormal patterns in the production process. Under normal circumstances, $d(t)$ is 0. $x(t)$ is the normal fluctuation caused by random factors, which obeys a normal distribution with a mean of 0 and a variance of $\sigma^2$.

In the normal mode, $d(t) = 0$ and the points on the control chart are randomly distributed. In the periodic mode, $d(t) = a \sin(2\pi t / T)$, where $a$ and $T$ are the amplitude and period, respectively. In the up (down) trend mode, $d(t) = kt$, in which $k$ is the slope of the trend, positive when up and negative when down. In the up (down) step mode, $d(t) = v s$, in which $t_0$ is the moment when the step occurs and $s$ is the step amplitude; $v = 1$ when $t \geq t_0$ and 0 otherwise. A mixed anomaly mode can be described as the sum of the abnormal fluctuation terms of the basic anomaly modes involved.
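The pattern model above can be illustrated with a short simulation sketch. It assumes the additive form "mean + random noise + abnormal term" with a sinusoidal cycle, a linear trend, and a step that switches on at a given moment. The function name `simulate_pattern`, the component labels, and all default parameter values are hypothetical choices for illustration and are not the values of Table 1.

```python
import numpy as np

def simulate_pattern(components=(), n_points=25, mu=0.0, noise_sd=1.0, rng=None,
                     amplitude=2.0, period=8.0, slope=0.2, shift=2.0, t0=10):
    """Simulate one control chart series as mean + noise + abnormal term.

    `components` is a tuple of labels; an empty tuple gives the normal mode,
    several labels give a mixed mode. All defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(1, n_points + 1)
    x = rng.normal(0.0, noise_sd, size=n_points)  # random (common-cause) fluctuation
    d = np.zeros(n_points)                        # abnormal fluctuation
    if "cyclic" in components:
        d = d + amplitude * np.sin(2.0 * np.pi * t / period)
    if "up_trend" in components:
        d = d + slope * t
    if "down_trend" in components:
        d = d - slope * t
    if "up_step" in components:
        d = d + shift * (t >= t0)                 # step is active from t0 onward
    if "down_step" in components:
        d = d - shift * (t >= t0)
    return mu + x + d

# Example: a mixed cyclic + upward-trend sample of 25 points.
sample = simulate_pattern(("cyclic", "up_trend"), rng=np.random.default_rng(1))
```

Passing several labels adds their abnormal terms, which mirrors the description of mixed modes as sums of basic ones.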

3. Quality Anomaly Recognition Model Based on Optimized PNN

The quality anomaly recognition model based on the optimized probabilistic neural network (PNN) is shown in Figure 2. It comprises three stages: feature extraction, pattern classification, and parameter optimization. In the feature extraction stage, PCA reduces the dimension of the original sample data while retaining 85% of the information in the original data. The PNN is then used to identify the control chart pattern, covering both basic and mixed patterns. The improved single-objective genetic algorithm (SGA) is used to optimize the main parameter of the PNN, with the recognition accuracy on the training samples serving as the fitness function of the SGA.

The detailed steps of the model are as follows:
Step 1: generation of sample data. The Monte Carlo method is used to simulate and generate sample data for the six basic modes and the four mixed modes.
Step 2: feature extraction. The sample data are reduced to a lower dimension by PCA and used as the input of the PNN.
Step 3: establishment of the PNN. The network structure is determined and the parameter range is set for subsequent pattern recognition and algorithm optimization.
Step 4: optimization of parameters. The improved SGA is used to optimize the parameter. The recognition accuracy of the PNN is taken as the fitness function to find the optimal smoothing coefficient $\sigma$.
Step 5: classification. The PNN with the optimized parameter recognizes the abnormal modes of the control chart.

3.1. Feature Extraction and Dimensionality Reduction of Control Chart Based on PCA

In this paper, PCA-based feature extraction is used as a feasible alternative to computing shape and statistical features or applying the wavelet transform to control chart data. PCA avoids the complexity of calculating multiple features and, unlike the wavelet transform, does not require choosing an appropriate number of decomposition layers and wavelet levels. Experiments confirm that it achieves a comparable processing effect.

Principal component analysis (PCA) is one of the most commonly used methods of feature dimension reduction in data processing. It extracts the main components from complex multidimensional data and can greatly improve the training speed of a model without losing too much model quality [33]. PCA finds a set of mutually orthogonal coordinate axes in the distribution space of the original data. The first new axis is the direction with the largest variance in the original data, the second new axis is the direction with the largest variance among those orthogonal to the first axis, and so on, yielding a series of such axes. The first few axes, which contain most of the variance, are used to reduce the dimensionality of the data features.
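As a minimal sketch of this procedure, the following NumPy code computes the principal axes by singular value decomposition of the centered data and keeps the smallest number of leading components whose cumulative variance ratio reaches a threshold (85% is the value used in this paper). The function name `pca_reduce` and its return layout are assumptions made for illustration.

```python
import numpy as np

def pca_reduce(X, var_ratio=0.85):
    """Project X (n_samples, n_features) onto its leading principal components.

    Keeps the fewest components whose cumulative explained-variance ratio
    is at least `var_ratio`.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                                    # center each feature
    # Rows of Vt are the orthogonal principal axes, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)                  # variance ratio per axis
    k = int(np.searchsorted(np.cumsum(explained), var_ratio)) + 1
    k = min(k, len(s))                               # guard against rounding at 100%
    components = Vt[:k]
    return Xc @ components.T, components, mean, explained[:k]
```

New samples would be projected with the stored `mean` and `components`, that is `(X_new - mean) @ components.T`, so that training and test data share the same axes.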

Figures 3 to 6 show the two-dimensional and three-dimensional scatter plots of the original data and the PCA-processed data, respectively. The figures show clearly that data of different anomaly types are well separated after PCA, which facilitates the subsequent classification by the PNN.

3.2. Anomaly Recognition Method Based on PNN

The probabilistic neural network (PNN) is a kind of radial basis neural network. It replaces the sigmoid (S-shaped) activation function commonly used in neural networks with an exponential function to compute a nonlinear decision boundary that approaches the Bayesian optimal decision surface [34]. The PNN is a nonparametric estimation method based on Bayesian optimal classification decision theory and probability density function estimation in statistics [35]. Unlike traditional neural networks trained with the backpropagation algorithm, the PNN is a feedforward network without feedback. The structure of a typical probabilistic neural network is shown in Figure 7.

The PNN is composed of the input layer, the sample layer, the summation layer, and the output layer (the competition layer). The number of neurons in the input layer equals the dimension of the feature vector, which is the data dimension after PCA processing, and the input layer calculates the distance between the input vector and all the training sample vectors. The sample layer contains all the training samples: its number of neurons equals the number of training samples, and its activation function is a Gaussian function. The number of neurons in the summation layer equals the number of categories, and the outputs of the sample layer are summed per category. The output layer, also called the competition layer, has only one neuron; the category with the highest probability value is output as 1.
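The four layers described above can be sketched in a few lines of NumPy. This is an illustrative implementation of a generic Gaussian-kernel PNN with a single shared smoothing parameter and equal class priors, not the authors' code; the class name `PNN` and its attribute names are hypothetical.

```python
import numpy as np

class PNN:
    """Minimal probabilistic neural network with one shared smoothing parameter."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def fit(self, X, y):
        # "Training" only stores the samples: one sample-layer neuron per sample.
        self.X_ = np.asarray(X, dtype=float)
        self.classes_, self.y_idx_ = np.unique(np.asarray(y), return_inverse=True)
        return self

    def class_scores(self, X):
        X = np.asarray(X, dtype=float)
        # Input layer: squared Euclidean distance to every training sample.
        d2 = ((X**2).sum(axis=1)[:, None]
              + (self.X_**2).sum(axis=1)[None, :]
              - 2.0 * X @ self.X_.T)
        # Sample layer: Gaussian activation of each stored sample.
        k = np.exp(-np.maximum(d2, 0.0) / (2.0 * self.sigma**2))
        # Summation layer: average activation per class.
        return np.stack([k[:, self.y_idx_ == c].mean(axis=1)
                         for c in range(len(self.classes_))], axis=1)

    def predict(self, X):
        # Output (competition) layer: the class with the largest score wins.
        return self.classes_[self.class_scores(X).argmax(axis=1)]
```

Because fitting only stores the training set, the cost moves to prediction, which grows with the number of training samples; this is one reason reducing the input dimension beforehand is useful.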

In this paper, the input of the PNN for anomaly recognition is the control chart data after PCA processing, so the number of neurons in the input layer is determined by the result of PCA processing. The number of neurons in the sample layer equals the number of training samples. The summation layer has 10 neurons, one for each of the 10 categories, which comprise six basic modes and four mixed modes. Compared with the traditional BP neural network and the RBF neural network, the PNN has a simple learning process, learns quickly, classifies more accurately, tolerates errors and noise well, and does not suffer from the local minimum problem. When the number of representative training samples becomes large enough, the classifier converges to the Bayesian optimal classifier.

3.3. Optimization of PNN Network Parameters Based on Genetic Algorithm

In the PNN model, the only parameter that needs to be adjusted is $\sigma$, which is called the smoothing parameter. When $\sigma$ is too small, each training sample acts only in isolation and the network is essentially a nearest neighbor classifier; when $\sigma$ is too large, the network cannot resolve details and may not classify well when the boundaries between categories are inconspicuous, so that its behavior approaches that of a linear classifier.
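For reference, the role of the smoothing parameter can be read from the standard Parzen-window form of the PNN class score. This is the textbook Gaussian-kernel formulation with equal priors; the symbols $x_{c,i}$ (the $i$-th training sample of class $c$), $N_c$ (the number of training samples in class $c$), and $p$ (the input dimension) are notation assumed here rather than taken from the paper.

```latex
f_c(x) = \frac{1}{N_c\,(2\pi)^{p/2}\,\sigma^{p}}
         \sum_{i=1}^{N_c} \exp\!\left(-\frac{\lVert x - x_{c,i}\rVert^{2}}{2\sigma^{2}}\right),
\qquad
\hat{c}(x) = \arg\max_{c} f_c(x)
```

As the smoothing parameter $\sigma$ shrinks, each sum is dominated by the training sample closest to $x$, which gives the nearest neighbor behavior; as it grows, every kernel becomes nearly flat and fine structure in the class densities is smoothed away.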

The genetic algorithm (GA) [36] is a computational model that simulates the natural selection and genetic mechanisms of Darwin's theory of biological evolution. It searches for optimal solutions by simulating the natural evolutionary process. In each generation, the genetic algorithm produces a new population through selection, crossover, and mutation according to the fitness values of the individuals, which enables the population to evolve continuously. At the same time, a global parallel search is used to find the best individuals and thereby an approximately optimal solution to the problem. The genetic algorithm does not require the objective function to be continuous or differentiable, and it searches globally, which makes it less prone to becoming trapped in local optima. Therefore, the genetic algorithm is used to optimize the smoothing coefficient of the PNN. The specific process is as follows:
(1) The range of the smoothing factor $\sigma$ is set, and an initial population of $N$ individuals is randomly generated, where $N$ is the scale of the population, and the current generation number is set to $t = 0$.
(2) According to the smoothing factor encoded by each chromosome, the PNN is constructed, the number of correct classifications and the accuracy rate are calculated, and the fitness of the chromosome is calculated accordingly.
(3) Winners are selected, and operations such as crossover and mutation are performed to obtain the next generation population.
(4) The generation counter is updated.
(5) The termination condition is checked, that is, whether the given number of iterations has been reached. If it is met, the algorithm stops; otherwise, it returns to step (2).
(6) The individual with the best fitness value is taken as the result of the optimization and is input into the PNN to obtain the final recognition model.
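Steps (1) to (6) can be sketched as a small real-coded genetic algorithm for a single parameter. The operators chosen here (binary tournament selection, arithmetic crossover, Gaussian mutation) and all default rates are assumptions for illustration, since the paper does not specify them; `fitness` is any callable that maps a smoothing parameter to a score such as training accuracy.

```python
import numpy as np

def ga_optimize(fitness, lo, hi, pop_size=30, n_gen=40,
                p_cross=0.8, p_mut=0.1, rng=None):
    """Maximize fitness(x) for a scalar x in [lo, hi] with a simple GA."""
    rng = np.random.default_rng() if rng is None else rng
    pop = rng.uniform(lo, hi, size=pop_size)           # step 1: initial population
    best_x, best_f = None, -np.inf
    for _ in range(n_gen):                             # steps 4-5: fixed generation budget
        fit = np.array([fitness(x) for x in pop])      # step 2: evaluate each chromosome
        i = int(fit.argmax())
        if fit[i] > best_f:                            # remember the best individual seen
            best_x, best_f = float(pop[i]), float(fit[i])
        # step 3a: binary tournament selection
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = np.where(fit[a] >= fit[b], pop[a], pop[b])
        # step 3b: arithmetic crossover with a shuffled mate
        mates = rng.permutation(parents)
        w = rng.uniform(size=pop_size)
        do_cross = rng.uniform(size=pop_size) < p_cross
        children = np.where(do_cross, w * parents + (1.0 - w) * mates, parents)
        # step 3c: Gaussian mutation, then clip back into the allowed range
        do_mut = rng.uniform(size=pop_size) < p_mut
        children = children + do_mut * rng.normal(0.0, 0.1 * (hi - lo), size=pop_size)
        pop = np.clip(children, lo, hi)
    return best_x, best_f                              # step 6: best individual found
```

In the setting of this paper, `fitness` would build a PNN with the candidate smoothing parameter and return its recognition accuracy on the training samples.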

4. Simulation Experiment Analysis

4.1. Data Simulation and Parameter Setting

The Monte Carlo method is used to generate 1000 samples for each of the six basic patterns and four mixed patterns, for a total of 10,000 control chart samples. Of these, 70% were chosen as the training set and 30% as the test set. Each control chart sample contains 25 data points, set according to the control chart standard in the SPC manual of the TS16949 series of quality technology standards from the American Automotive Industry Action Group (AIAG). The parameter values of each abnormal mode are shown in Table 1.

4.2. Feature Extraction and Dimensionality Reduction

PCA is used to reduce the dimensionality of the original data and extract features. Following common practice, the leading principal components whose cumulative variance contribution rate reaches 85% are retained, which here corresponds to the first nine components. These components are therefore taken as the dimension-reduced data. Figure 8 shows the principal components with a cumulative variance contribution rate of 85% and their respective proportions.

According to the results of PCA, the original 25-dimensional data are reduced to 9 dimensions. The 9 principal components are used as the dimension-reduced data and input into the PNN for training, which can greatly reduce the training time of the model and increase the accuracy of the recognition result.

4.3. Algorithm Parameters Optimization

The improved single-objective genetic algorithm (SGA) differs from the standard genetic algorithm flow as follows: offspring are first generated by crossover and mutation, the parent and offspring generations are then merged, and finally the elite individuals are selected from the merged population and retained. This prevents the best individual from being destroyed by the crossover operation.

The optimization algorithm uses a population of 100 individuals per generation and evolves for 200 generations in total. When the SGA runs, a set of smoothing parameters is randomly generated, offspring are produced by crossover and mutation, the parents and offspring are then combined, and the individuals with high fitness are selected to survive. The final optimization results of the improved SGA are shown in Figure 9, in which the blue curve represents the average training accuracy of each generation and the orange curve represents the best recognition accuracy of each generation. At the 11th generation, the training accuracy of the best individual in the population reaches its maximum, with an optimal smoothing parameter of 1.7385 and an optimal objective function value of 95.10%, which is also the training accuracy. By contrast, when the PNN is optimized by the traditional standard genetic algorithm, the optimal smoothing parameter is 0.8161 and the recognition accuracy of the PNN is only 81.69%; these results are shown in Figure 10. Consequently, the PNN optimized by the improved SGA achieves a higher recognition accuracy than the PNN optimized by the standard genetic algorithm.
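The merge-then-select step that distinguishes the improved SGA from the standard flow can be sketched as follows. This is a generic elitist survivor selection under the assumption that the merged population is simply truncated to the original population size by fitness; the function name `elitist_survivors` is hypothetical.

```python
import numpy as np

def elitist_survivors(parents, parent_fit, children, child_fit):
    """Merge parents and offspring, then keep the fittest len(parents) individuals."""
    merged = np.concatenate([parents, children])
    merged_fit = np.concatenate([parent_fit, child_fit])
    # Sort by decreasing fitness; a stable sort keeps parents ahead of equal-fitness children.
    keep = np.argsort(-merged_fit, kind="stable")[: len(parents)]
    return merged[keep], merged_fit[keep]

# Example with three parents and three offspring (values are smoothing parameters).
survivors, survivor_fit = elitist_survivors(
    np.array([1.0, 2.0, 3.0]), np.array([0.10, 0.90, 0.50]),
    np.array([4.0, 5.0, 6.0]), np.array([0.80, 0.20, 0.95]),
)
```

Because the best parent always competes with its offspring, the best fitness in the population can never decrease from one generation to the next.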

4.4. Model Test Result

The optimized parameter was substituted into the model and the test samples were input; the final recognition results are shown in Table 2.

The model test results show that the proposed method has a high recognition accuracy for the basic modes, most of which are recognized with an accuracy above 97%, and the average recognition rate for the mixed modes is also more than 90%. However, the recognition accuracy is relatively low for the cyclic step mode, the cyclic trend step mode, and the cyclic trend mode, where a considerable proportion of samples are recognized as other modes. This indicates that the discrimination between the trend and step components of the control chart needs to be improved. In future work, other measures may be considered for classifying and recognizing these modes, such as using different smoothing parameters for different modes, which could improve the recognition performance of the model.

4.5. Comparison with Other Methods

For comparison with the PCA-SGA-PNN method proposed in this paper, BP, PNN, PCA-PNN, and PSO-SVM control chart recognition models were also established. The five models compared are as follows:
(1) A BP neural network with 25 input layer neurons and 3 hidden layers.
(2) A PNN used for pattern recognition without any data processing.
(3) A PNN whose input is the data after PCA dimensionality reduction, with no parameter optimization.
(4) A PNN whose parameter is optimized by the improved SGA and which is then used to identify the control chart mode; this is the method proposed in this paper.
(5) An SVM optimized by the particle swarm optimization algorithm.
The comparison results of these methods are shown in Table 3.

From the analysis of the results in the table, the following conclusions can be drawn:
(1) The recognition accuracy of a single BP neural network or PNN is relatively low, and the recognition accuracy of the model with PCA dimensionality reduction is clearly improved, indicating that feature extraction benefits the recognition of both basic and mixed modes of control charts.
(2) The recognition accuracy of the model is further improved after the PNN parameter is optimized by the SGA. Parameter optimization avoids the blindness of relying on an empirical value and yields a better recognition rate.
(3) Compared with the SVM optimized by particle swarm optimization, the method proposed in this paper has a higher recognition accuracy, which indicates that the PNN has certain advantages in control chart pattern recognition and is a feasible and effective identification method.

5. Conclusions

This paper proposes a method of quality anomaly pattern recognition based on a probabilistic neural network optimized by an improved single-objective genetic algorithm.
(1) To reduce the training time of the model, principal component analysis (PCA) was adopted to reduce the dimensionality of the original control chart data.
(2) The PNN recognized both single and mixed modes of the control chart, owing to its simple structure and good recognition performance. The smoothing parameter of the PNN was optimized by the improved single-objective genetic algorithm (SGA), which removed the reliance on an empirical value. The improved SGA found an optimal smoothing parameter of 1.7385, with which the PNN achieved an optimal objective function value of 95.10%, which is also the recognition accuracy. By comparison, the PNN optimized by the traditional genetic algorithm reached a recognition accuracy of only 81.69%. Therefore, the PNN optimized by the improved SGA achieves a higher recognition accuracy than the PNN optimized by the traditional genetic algorithm.
(3) The above results were verified by simulation experiments. Compared with the traditional BP neural network, a single PNN, the PCA-PNN model without parameter optimization, and the SVM model optimized by the particle swarm optimization (PSO) algorithm, the PNN optimized by the improved SGA achieved the best recognition performance. Consequently, the method proposed in this paper can help promote the application of control charts in actual production and help enterprises discover quality anomalies in the production process in a timely manner.

Data Availability

The data used to support the findings of this study are available from the corresponding author on request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Acknowledgments

The authors gratefully acknowledge the financial support by the National Key Research and Development Program of China (Grant no. 2019YFB1703800).