BP neural network modeling with sensitivity analysis on monotonicity based Spearman coefficient
Introduction
In chemical processes, accurate real-time measurements of quality variables play an important role in process control, monitoring, and optimization [1,2]. Usually, these key variables are measured through either infrequent offline analysis or expensive online measuring devices, which may lead to a long time delay. To predict these variables quickly, soft sensor technology, which describes the mathematical relationship between the input variables and the output variable [3], has been proposed. Usually, the output variable is difficult to measure, whereas the input variables, such as temperature and flow rate, are easy to measure [4]. Over the last two decades, soft sensor technology has gained fast-growing attention in both academia and industry [5], and many soft sensor methods have been proposed. According to the modeling mechanism, soft sensor methods can be classified into two groups: first-principles models and data-driven models [6]. The former can establish an accurate descriptive model based on process mechanism information, e.g. mass and component balances and reaction kinetics equations. However, it is sometimes difficult to understand the actual complex industrial process, and the obtained model may not meet the accuracy requirements of the process system. The latter are based on historical operation data and require no detailed information about the system. In recent years, data-driven modeling methods have developed rapidly as a result of the wide use of distributed control systems (DCS) and data acquisition systems.
With the development of research, data-driven soft sensor methods can be further divided into three categories: latent variable modeling methods [7], statistical learning methods [8] and artificial intelligence methods [9]. The main idea of a latent variable model is to determine the mapping weights by optimization criteria to construct latent variables, and then to establish the regression model between the input variables and the output variable. The two most commonly used latent variable modeling methods are principal component regression (PCR) [10] and partial least squares (PLS) [11]. Statistical learning methods are based on the minimization of the structural risk and have a more rigorous theoretical basis. The most commonly used statistical learning method is support vector regression (SVR) [12]. For artificial intelligence methods, the structure of the predictive model is more complex and has a better nonlinear approximation ability. The artificial neural network (ANN) [13] is the most popular artificial intelligence method. An ANN model has the capacity to learn and model complex nonlinear systems. Cybenko [14] pointed out that an ANN with one hidden layer can form an arbitrarily close approximation to any continuous nonlinear mapping, assuming only that the transfer function is non-constant, bounded and continuous. In the ANN family, the back-propagation neural network (BPNN) is the most popular modeling method. It has been widely used in pattern recognition [15] and process optimization [16]. Normally, a BPNN has three layers: an input layer, a hidden layer and an output layer. To build a BPNN with m input layer nodes, n hidden layer nodes and 1 output node, (m + 2)n + 1 parameters [17], including weights and thresholds, need to be estimated. However, the BPNN possesses some inherent demerits caused by the generality of the data-driven method, e.g. the overfitting problem [18].
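As a quick check on model size, the parameters of a three-layer network can be tallied layer by layer. The tally below assumes one threshold per hidden node and one for the output node, which is a common convention but is an assumption here rather than something taken from [17].

```latex
\begin{aligned}
N_{\text{param}}
  &= \underbrace{mn}_{\text{input-to-hidden weights}}
   + \underbrace{n}_{\text{hidden thresholds}}
   + \underbrace{n}_{\text{hidden-to-output weights}}
   + \underbrace{1}_{\text{output threshold}} \\
  &= (m + 2)\,n + 1 .
\end{aligned}
```

Under this convention, a network with m = 2 inputs and n = 10 hidden nodes would have (2 + 2) x 10 + 1 = 41 parameters.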
In recent years, researchers have proposed different strategies to avoid the overfitting problem. For example, Srivastava et al. [19] initialized and optimized the weight parameters of training samples to avoid overfitting. Cozad et al. [20] used regularization, which adds a penalty term to the loss function to discourage the coefficients from reaching large values, to control overfitting. Daniels [21] developed a special class of monotonic neural networks and a corresponding training algorithm to overcome overfitting. Cheng and Li [22] proposed a method to construct a BPNN model that incorporates monotonicity prior knowledge from experts as a constraint term, so that the training process becomes an optimization problem constrained by the monotonicity prior knowledge. However, such prior knowledge is sometimes difficult to obtain. For example, for some complex chemical processes, it is impossible for experts to analyze the mechanism of the process. In this paper, a new knowledge acquisition method is proposed to extract monotonic information from historical process data without the help of experts. Furthermore, the acquired knowledge is used as constraints in building the soft sensor model to improve its extrapolation performance.
The remainder of the paper is organized as follows. The theoretical background of BPNN modeling and the Spearman coefficient is reviewed in Section 2. Section 3 proposes the method to acquire sensitivity knowledge and the soft sensor modeling method based on BPNN and the acquired monotonicity knowledge. A numerical experiment and an industrial case on measuring the cracking severity of an ethylene furnace are used to test the effectiveness of the proposed method in Section 4 and Section 5, respectively. Finally, brief conclusions are given in Section 6.
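One way to picture a monotonicity constraint term is as a count of sample points at which the model's prediction moves against the expected direction when a single input is increased. The following is a minimal sketch under that assumption. The function names `count_violations` and `penalized_loss`, the finite step `step`, and the penalty weight `lam` are all illustrative and are not taken from the cited works.

```python
def count_violations(model, xs, index, direction=1, step=0.1):
    """Count samples where raising input `index` by `step` moves the
    prediction against `direction` (+1 = increasing, -1 = decreasing)."""
    violations = 0
    for x in xs:
        shifted = list(x)
        shifted[index] += step
        delta = model(shifted) - model(x)
        if direction * delta < 0:
            violations += 1
    return violations


def penalized_loss(model, xs, ys, index, direction=1, lam=1.0):
    """Mean squared error plus `lam` times the number of violations."""
    mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return mse + lam * count_violations(model, xs, index, direction)


# Toy model: increasing in x[0], decreasing in x[1].
toy_model = lambda x: x[0] - x[1]
toy_xs = [[0.0, 0.0], [1.0, 2.0]]
toy_ys = [0.0, -1.0]
```

Because the violation count is not differentiable, a loss of this shape would typically be minimized with a derivative-free optimizer rather than plain gradient descent.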
A hybrid BPNN modelling method combined with prior information
BPNN is a multi-layer feedforward artificial neural network trained by the error back-propagation algorithm proposed by Rumelhart and McClelland [23]. For a BPNN model, the training process optimizes the weights and thresholds. If the weights and thresholds are suitable, the model fits the training samples to the required degree and also has better prediction performance. For the BPNN, the model function can be written as eq (1).
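A common form of the model function for a three-layer BPNN is a sigmoid hidden layer followed by a linear output node. The sketch below assumes that form. The activation choice, the convention of adding thresholds as biases, and the names `sigmoid` and `bpnn_forward` are assumptions for illustration and may differ from the authors' eq (1).

```python
import math


def sigmoid(z):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def bpnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of an m-n-1 network.

    x        : list of m inputs
    w_hidden : n rows, each with m input-to-hidden weights
    b_hidden : n hidden thresholds (added as biases)
    w_out    : n hidden-to-output weights
    b_out    : scalar output threshold
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

Training then amounts to adjusting `w_hidden`, `b_hidden`, `w_out` and `b_out` so that the outputs match the training targets.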
Filtering treatment for the distributed scatter plot
The distributed scatter plot can describe the nonlinear monotone relationship between covariate and response variables, as introduced in Section 2. However, the covariates interfere with one another, so some points scatter away from the diagonal line and may hamper the mining of monotone prior knowledge. This paper introduces a binary (0–1) integer programming model [29] to filter the interfering points in the distributed scatter plot.
Firstly, the distributed scatter plot
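For reference, the Spearman coefficient behind this analysis is the Pearson correlation computed on ranks, so it reaches +1 or -1 for any strictly monotone relationship, linear or not. The sketch below is a plain-Python illustration that uses average ranks for ties. The function names are hypothetical, and the code does not reproduce the paper's scatter-plot construction or its 0–1 integer programming filter.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

A coefficient close to +1 or -1 between a covariate and the response suggests an increasing or decreasing relationship, respectively, while values near 0 give no monotone information.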
A numerical example
In this section, a numerical example is used to validate the performance of the BPNN modeling method with monotonic knowledge. The bivariate function is given by eq (17), and its plot is provided in the supplementary material.
Each covariate is sampled from a Gaussian distribution with mean 0 and variance 1. In this section, a BPNN is used to describe the nonlinear process. The structure of the BPNN model is set to 2-10-1, which
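The sampling step described above can be sketched as follows. Eq (17) is not shown here, so `placeholder_response` is a hypothetical stand-in chosen only so that the script runs. The sample size of 200 and the seed are likewise assumptions.

```python
import random


def sample_covariates(n_samples, n_inputs=2, seed=0):
    """Draw each covariate independently from N(0, 1)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
        for _ in range(n_samples)
    ]


def placeholder_response(x):
    # Hypothetical stand-in for eq (17): increasing in x[0], decreasing in x[1].
    return x[0] ** 3 - x[1]


X = sample_covariates(200)
y = [placeholder_response(x) for x in X]
```

The two columns of `X` correspond to the two input nodes of the 2-10-1 structure, and `y` is the single output to be fitted.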
Application to the cracking severity
Ethylene production has become a measure of the development level of a country's petrochemical industry. The cracking furnace is a key piece of equipment in ethylene production and plays an important role in production capacity, stable production and energy consumption. Therefore, how to control the various indicators of the ethylene cracking furnace and promote the severity of the cracking reaction has become a research focus. The flow chart of the SRT-Ⅲ cracking furnace is given in
Conclusions
In this paper, a distributed scatter plot of bivariate variables based on the Spearman coefficient is employed to analyze the monotonicity sensitivity of observed data. When there is strong coupling or noise among the variables, a 0–1 integer linear programming model is used to filter the noise in the distributed scatter plot, and the PCA method is used for decoupling. Then the obtained number of violations is added to the BPNN model. Finally, a new EA, namely
CRediT authorship contribution statement
Yang Zhou: Conceptualization, Methodology, Software, Validation, Data curation, Writing - original draft. Shaojun Li: Writing - review & editing, Supervision, Funding acquisition.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors of this paper appreciate the National Natural Science Foundation of China (under Project No. 21676086) and the Fundamental Research Funds for the Central Universities (222201917006) for their financial support.
References (40)
- et al. Soft sensing of non-Gaussian processes using ensemble modified independent component regression. Chemometr. Intell. Lab. (2016)
- et al. Data-driven soft sensor development based on deep learning technique. J. Process Contr. (2014)
- et al. Prediction of thermal and mass loss behavior of mineral mixture using inferential stochastic modeling and thermal analysis measurement data. Measurement (2017)
- et al. A hybrid just-in-time soft sensor for carbon efficiency of iron ore sintering process based on feature extraction of cross-sectional frames at discharge end. J. Process Contr. (2017)
- et al. How to avoid over-fitting in multivariate calibration-The conventional validation approach and an alternative. Anal. Chim. Acta (2007)
- et al. A combined first-principles and data-driven approach to model building. Comput. Chem. Eng. (2015)
- et al. A modeling method based on artificial neural network with monotonicity knowledge as constraints. Chemometr. Intell. Lab. (2015)
- et al. Multivariate probabilistic safety analysis of process facilities using the Copula Bayesian Network model. Comput. Chem. Eng. (2016)
- et al. A probabilistic multivariate method for fault diagnosis of industrial processes. Chem. Eng. Res. Des. (2015)
- et al. Alopex-based evolutionary algorithm and its application to reaction kinetic parameter estimation. Comput. Ind. Eng. (2011)