A batch-wise LSTM-encoder decoder network for batch process monitoring

https://doi.org/10.1016/j.cherd.2020.09.019

Highlights

  • Proposed a multi-layer recurrent neural network in the encoder–decoder structure and the corresponding monitoring method for batch process monitoring.

  • The proposed network can simultaneously capture the between-batch and within-batch dynamic features in the nonlinear batch process.

  • The proposed model outperforms the AutoEncoder and Sub-PCA model in fault detection.

Abstract

Process monitoring is essential for maintaining consistent quality and safe operation in batch processes. However, the multiphase, nonlinear and dynamic character of batch processes makes monitoring them a complicated task. In this work, a multi-layer recurrent neural network with an encoder–decoder structure, called the batch-wise LSTM-encoder decoder network, is proposed to address these difficulties. The LSTM-encoder extracts the nonlinear dynamic features in both the between-batch and within-batch directions, then projects the high-dimensional input space onto a low-dimensional hidden state space. The decoder regenerates the samples from the hidden states. The control statistics H² and SPE are designed for process monitoring, and the corresponding control limits are estimated by kernel density estimation. A case study on an extensive reference penicillin fermentation dataset suggests that the proposed method detects faulty samples more effectively than previous methods while keeping the same robustness under normal conditions.

Introduction

Batch processes play a significant role in modern industries such as pharmaceuticals, semiconductors and biotechnology (Wang et al., 2019). A batch process is defined as a process that repeatedly produces a finite quantity of product at a time by following a sequence of process steps (Mehta and Reddy, 2015). Because production is flexible, it is important to maintain consistent quality and safe operation in the batch process. Process monitoring, in other words online fault detection, is vital to meeting these quality and safety demands, and it has been an active research field over the past decades (Qin, 2012).

Owing to the abundant networked sensors in modern batch processes, statistical process control (SPC) methods have been widely applied to monitor and control the process (Yao and Gao, 2009). The univariate control chart is a feasible and popular monitoring method. However, it suffers from “blind spots” in a multivariate setting, which would loosen the control limits (Wang et al., 2017). To overcome these difficulties, many multivariate process monitoring methods have been developed. Compared with continuous processes, the dynamic features within and between batches make batch process monitoring more complex and challenging. Among the methods suitable for batch processes, multiway principal component analysis (MPCA) is the most widely used (Nomikos and MacGregor, 1994). Multiway methods first unfold the three-dimensional (Batch × Time × Variable) historical batch process data cube into a two-dimensional matrix, for example by batch-wise unfolding, Batch × (Time × Variable), or by variable-wise unfolding, Variable × (Time × Batch). PCA is then applied to the two-dimensional matrix to project the high-dimensional data into a low-dimensional space, and statistics such as Hotelling's T² and the squared prediction error (SPE) are adopted to measure the variation in the principal component space and the residual information, respectively.
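The MPCA procedure described above can be illustrated with a minimal NumPy sketch. It is not taken from the cited works: the function names (`batchwise_unfold`, `fit_mpca`, `monitor`), the synthetic data and the choice of three components are assumptions made for illustration, and only complete batches of equal length are handled.

```python
import numpy as np

def batchwise_unfold(cube):
    """Unfold a (batch, time, variable) cube into a batch x (time*variable) matrix."""
    n_batches, n_times, n_vars = cube.shape
    return cube.reshape(n_batches, n_times * n_vars)

def fit_mpca(cube, n_components):
    """Fit PCA on the batch-wise unfolded, standardized training data."""
    X = batchwise_unfold(cube)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                      # guard against constant columns
    Z = (X - mean) / std
    # Rows of Vt are the principal directions of the scaled data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                  # loading matrix
    eigvals = s[:n_components] ** 2 / (Z.shape[0] - 1)   # score variances
    return {"mean": mean, "std": std, "P": P, "eigvals": eigvals}

def monitor(model, batch):
    """Return (T2, SPE) for one complete batch of shape (time, variable)."""
    z = (batch.reshape(-1) - model["mean"]) / model["std"]
    t = model["P"].T @ z                     # scores in the principal space
    t2 = float(np.sum(t ** 2 / model["eigvals"]))
    residual = z - model["P"] @ t            # part not explained by the model
    spe = float(residual @ residual)
    return t2, spe

# Synthetic example (assumed data): 40 normal batches, 20 time points, 5 variables.
rng = np.random.default_rng(0)
normal = rng.normal(size=(40, 20, 5))
model = fit_mpca(normal, n_components=3)

t2_ok, spe_ok = monitor(model, rng.normal(size=(20, 5)))
faulty = rng.normal(size=(20, 5))
faulty[10:, 2] += 8.0                        # step fault on variable 2 from time 10
t2_bad, spe_bad = monitor(model, faulty)
```

In this sketch T² measures the distance of the scores from the origin of the principal component space, while SPE measures what the retained components fail to reconstruct. Online use on an unfinished batch would additionally need a strategy for the not-yet-observed samples, which is omitted here.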

However, MPCA methods are linear and may fail to extract the nonlinear information in the batch process. To improve nonlinear process monitoring, many modified methods based on MPCA have been proposed in the past decades. Among them, multiphase methods (Lu et al., 2004, Zhao and Sun, 2013) and kernel methods (Lee et al., 2004, Jia et al., 2010) are the two mainstream approaches. Multiphase methods are similar to piecewise linearization. They first divide the batch into several phases, each containing relatively similar dynamic characteristics. Local MPCA models are then built for each phase separately. The ability to extract nonlinear information is improved by reducing the nonlinearity within each phase (Lu et al., 2004). Kernel PCA uses kernel functions to map data from the low-dimensional input space to a high-dimensional feature space, then applies PCA in that feature space (Lee et al., 2004).
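The kernel PCA idea can be sketched as follows. This is a generic illustration, not the implementation of Lee et al. (2004): the RBF kernel, the value of `gamma`, the helper names and the synthetic data are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kpca(X, n_components, gamma):
    """PCA in the implicit feature space via the centered kernel matrix."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one      # center in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)           # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    lam, alpha = eigvals[idx], eigvecs[:, idx]
    alpha = alpha / np.sqrt(lam)    # unit-norm components in feature space
    return {"X": X, "K": K, "alpha": alpha, "gamma": gamma}

def kpca_scores(model, Xnew):
    """Project new samples onto the kernel principal components."""
    n = model["X"].shape[0]
    K = model["K"]
    Kx = rbf_kernel(Xnew, model["X"], model["gamma"])
    one = np.full((n, n), 1.0 / n)
    one_new = np.full((Xnew.shape[0], n), 1.0 / n)
    Kxc = Kx - one_new @ K - Kx @ one + one_new @ K @ one
    return Kxc @ model["alpha"]

# Synthetic nonlinear data (assumed): noisy points along a curve in 3-D.
rng = np.random.default_rng(0)
t = np.linspace(0, 3, 30)
X = np.column_stack([t, t ** 2, np.sin(2 * t)]) + 0.05 * rng.normal(size=(30, 3))
model = fit_kpca(X, n_components=2, gamma=0.5)
scores = kpca_scores(model, X)
```

The nonlinear mapping is never computed explicitly; only inner products in the feature space, given by the kernel, are needed. The resulting scores play the same role as linear PCA scores when monitoring statistics are built.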

Nowadays, artificial neural networks (ANNs) are receiving more attention for their ability to extract nonlinear features effectively (Adamowski et al., 2012). In particular, autoencoders (AEs) and recurrent neural networks (RNNs) are two widely studied networks in the field of process monitoring (Yan et al., 2016, Yu and Zhao, 2019, Shahnazari, 2020, Zhang et al., 2019, Loy-Benitez et al., 2020). An AE is a multi-layer neural network with a low-dimensional central layer, and it is effective for dimension reduction and variance capturing in nonlinear continuous processes (Yan et al., 2016, Yu and Zhao, 2019). In Yan et al. (2016), variant autoencoders, including denoising autoencoders and contractive autoencoders, are applied to the Tennessee Eastman process (TE process), which is monitored with the statistic H². In Yu and Zhao (2019), a denoising autoencoder combined with the elastic net is proposed to generate a sparse and robust network. The statistics A², SPE and C are used to monitor the main hidden space, the residual information and the overall measurement in continuous processes. Given the sequential nature of process data, RNNs are extensively studied for their ability to capture sequential and dynamic information in continuous processes (Shahnazari, 2020). In Shahnazari (2020), a bank of RNNs is used to build a predictive model from the input sequence, and process monitoring is done by comparing the predictions with the actual samples. Moreover, networks combining AEs and RNNs have been developed for continuous process monitoring to exploit both dimension reduction and sequential information capturing (Zhang et al., 2019, Loy-Benitez et al., 2020). In Zhang et al. (2019), a UNet-based model, which is a variant of AEs with LSTM linking the corresponding encoder and decoder layers, is proposed to extract nonlinear features along with time-sequence information and is applied to monitor a power plant. In Loy-Benitez et al. (2020), a multi-input multi-output RNN encoder called MG-RNN-AE is designed to reduce dimension and reconstruct the original input sequence, and process monitoring is realized by monitoring the statistics H² and SPE.

However, all the above-mentioned neural networks can only extract the nonlinear features between samples. In a batch process, dynamic features exist not only between samples within a batch but also between batches. This indicates that specific designs are needed to improve the monitoring performance in nonlinear batch processes. In Jiang et al. (2020), the three-way batch process data cube is first unfolded into a two-dimensional matrix. An AE model is then constructed to extract the relations between variables. Finally, the dynamic features between batches are captured by canonical correlation analysis (CCA) of 2-D matrices that include samples from different batches. In this way, the combined hidden state of the AE and CCA is more suitable for monitoring batch processes.

To address the above concerns, a batch-wise LSTM-encoder decoder network, along with a monitoring scheme for the nonlinear batch process, is proposed. The main contribution of the proposed method is that the LSTM-encoder can capture the between-batch and within-batch dynamic features in the nonlinear batch process simultaneously by sliding through the batch-wise input sequence. Specifically, the batch-wise LSTM-encoder is a multi-layer recurrent neural network with a shrinking layer dimension. The decoder part is a symmetric multi-layer neural network. The input sequence of the LSTM-encoder is a series of kth samples from different batches. The encoder slides through these input sequences to obtain the between-batch features and shrinks layer by layer to obtain the within-batch features. After encoding, the last hidden state of the input sequence is passed to the decoder to reconstruct the input sample. Batch process monitoring is achieved by monitoring the proposed H² and SPE statistics. A case study on an extensive reference dataset associated with the PenSim benchmark model, proposed by Van Impe and Gins (2015), shows outstanding process-monitoring performance compared with the AE and MPCA monitoring models, especially for barely detectable faults.
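The data flow described above can be sketched with a NumPy forward pass. This is only an illustration under several assumptions that the text does not confirm: the weights are random and untrained (training by minimizing the reconstruction error is omitted), the decoder reconstructs the last sample of each sequence, H² is taken as the squared norm of the last hidden state, the layer sizes 6 → 4 → 2 are arbitrary, and the control limit uses a Gaussian kernel density estimate with a normal-reference bandwidth. All function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(rng, n_in, n_hid):
    # One stacked weight matrix for the input, forget, cell and output gates.
    return {"W": rng.normal(scale=0.1, size=(4 * n_hid, n_in + n_hid)),
            "b": np.zeros(4 * n_hid), "n_hid": n_hid}

def lstm_layer(p, seq):
    """Run one LSTM layer over seq (steps, n_in); return all hidden states."""
    n = p["n_hid"]
    h, c, out = np.zeros(n), np.zeros(n), []
    for x in seq:
        z = p["W"] @ np.concatenate([x, h]) + p["b"]
        i, f, o = sigmoid(z[:n]), sigmoid(z[n:2 * n]), sigmoid(z[3 * n:])
        g = np.tanh(z[2 * n:3 * n])
        c = f * c + i * g
        h = o * np.tanh(c)
        out.append(h)
    return np.stack(out)

def encode(layers, seq):
    """Stacked LSTM with shrinking width; return the last hidden state."""
    for p in layers:
        seq = lstm_layer(p, seq)
    return seq[-1]

def decode(dense, h):
    """Feed-forward decoder: tanh hidden layers, linear output layer."""
    for W, b in dense[:-1]:
        h = np.tanh(W @ h + b)
    W, b = dense[-1]
    return W @ h + b

def batchwise_sequences(cube, k, length):
    """Sliding windows over the batch axis of the kth time slice.
    cube is (batch, time, variable); result is (n_windows, length, variable)."""
    slice_k = cube[:, k, :]
    n = cube.shape[0] - length + 1
    return np.stack([slice_k[i:i + length] for i in range(n)])

def statistics(layers, dense, seq):
    """Assumed definitions: H2 = ||h||^2, SPE = ||x - x_hat||^2 for the last sample."""
    h = encode(layers, seq)
    err = seq[-1] - decode(dense, h)
    return float(h @ h), float(err @ err)

def kde_limit(values, alpha=0.99, grid_size=2000):
    """Control limit as the alpha-quantile of a Gaussian KDE of the statistic."""
    values = np.asarray(values, dtype=float)
    bw = 1.06 * values.std(ddof=1) * len(values) ** (-0.2)   # normal-reference rule
    grid = np.linspace(values.min() - 3 * bw, values.max() + 3 * bw, grid_size)
    dens = np.exp(-0.5 * ((grid[:, None] - values[None, :]) / bw) ** 2).sum(axis=1)
    cdf = np.cumsum(dens)
    cdf /= cdf[-1]
    return float(grid[np.searchsorted(cdf, alpha)])

# Synthetic normal-operation data (assumed): 30 batches, 50 time points, 6 variables.
rng = np.random.default_rng(0)
cube = rng.normal(size=(30, 50, 6))
layers = [init_lstm(rng, 6, 4), init_lstm(rng, 4, 2)]          # 6 -> 4 -> 2
dense = [(rng.normal(scale=0.1, size=(4, 2)), np.zeros(4)),    # 2 -> 4
         (rng.normal(scale=0.1, size=(6, 4)), np.zeros(6))]    # 4 -> 6
windows = batchwise_sequences(cube, k=10, length=5)
stats = np.array([statistics(layers, dense, w) for w in windows])
h2_limit, spe_limit = kde_limit(stats[:, 0]), kde_limit(stats[:, 1])
```

In use, a new sample would be appended to the sequence of kth samples from preceding batches, and a fault would be flagged when its H² or SPE exceeds the corresponding limit. A deep-learning framework would normally replace the hand-written LSTM and supply the training step.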

Section snippets

Multi-layer RNN

A recurrent neural network (RNN) is a kind of neural network used for modeling variable sequences, X = (x1, x2, …, xt). Unlike other neural networks, an RNN has a hidden state h and a feedback loop over the hidden states that updates the current hidden state, which introduces information from previous timestamps into the current state. As shown in Fig. 1, an RNN can have either a single-layer or a multi-layer structure. For a J-layer RNN, the current hidden state htj of tth time stamp and j

Batch data description

Measured process data in the multiphase batch process are usually stored in the form of a three-dimensional cube, X (I × J × K), recording K measured points with J process variables for all I batches, as shown in Fig. 2. It should be noted that the batch lengths are not fixed, which is not represented in Fig. 2. Nevertheless, the unfixed batch length problem is considered in this study, and the sample rates of all the variables are assumed to be equal. Time slice, xk (Ik × J) is

Case study

The PenSim benchmark model (Birol et al., 2002) is chosen as the test case in this study for its popularity in the field of batch process monitoring studies (Wang et al., 2019, Wang et al., 2018). It describes the production of penicillin in a fed-batch reactor, as shown in Fig. 5. Instead of using the simulation dataset created by ourselves, a public extensive reference dataset (Van Impe and Gins, 2015) based on the PenSim benchmark model is used in this study. In this reference dataset, the

Conclusion

In this paper, a batch-wise LSTM-encoder decoder network suitable for extracting the nonlinear dynamic features in the batch process, together with the corresponding process monitoring method, is proposed. The LSTM-encoder is a multi-layer dimension-reducing recurrent neural network. The input sequences are a series of samples from different batches. Based on the extracted hidden states and regenerated samples, the statistics H² and SPE are designed to measure the deviations in hidden state space and

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgments

The authors would like to acknowledge the financial support from National Science Foundation China grant no. U1609213.

References (30)

  • W. Yan et al. Nonlinear and robust statistical process monitoring based on variant autoencoders. Chemometr. Intell. Lab. Syst. (2016)

  • Y. Yao et al. A survey on multistage/multiphase statistical modeling methods for batch processes. Annu. Rev. Control (2009)

  • S. Yin et al. A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. J. Process Control (2012)

  • C. Zhao et al. Step-wise sequential phase partition (SSPP) algorithm based statistical modeling and online process monitoring. Chemometr. Intell. Lab. Syst. (2013)

  • J. Adamowski et al. Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada. Water Resour. Res. (2012)