BP neural network modeling with sensitivity analysis on monotonicity based Spearman coefficient

https://doi.org/10.1016/j.chemolab.2020.103977

Highlights

  • The proposed method performs sensitivity analysis on monotonicity based on the copula coefficient and applies it to neural network modeling.

  • The scatter plot and the Spearman coefficient are used to extract the monotonic information.

  • A 0–1 integer linear programming model is also applied to filter the scatter diagram.

Abstract

This paper proposes a new monotonicity extraction method that is used to constrain the modeling process of a neural network. The main contributions are a sensitivity analysis on monotonicity based on the Spearman coefficient and the application of the extracted monotonicity to neural network modeling. The study uses bivariate scatter plots and the Spearman coefficient to extract the monotonic information. To weaken the influence of noise, a binary 0–1 integer linear program is applied to filter the scatter diagram. Based on the monotonicity information, a constrained optimization problem is formulated to obtain the BP neural network model, and an Alopex-based evolutionary algorithm (AEA) is used to search for the optimal weights and thresholds. The results on a numerical example and an ethylene cracking furnace show that the proposed approach achieves good prediction performance in both cases.

Introduction

In chemical processes, accurate real-time measurements of quality variables play an important role in process control, monitoring, and optimization [1,2]. Usually, these key variables are measured through either infrequent offline analysis or expensive online measuring devices, which may lead to a long time delay. To predict these variables quickly, soft sensor technology, which describes the mathematical relationship between the input variables and the output variable [3], has been proposed. Usually, the output variable is difficult to measure, whereas the input variables, such as temperature and flow rate, are easy to measure [4]. Over the last two decades, soft sensor technology has gained fast-growing attention in both academia and industry [5], and many soft sensor methods have been proposed. According to their modeling mechanism, soft sensor methods can be classified into two groups: first-principle models and data-driven models [6]. The former establish an accurate descriptive model based on process mechanism information, e.g. mass and component balances and reaction kinetics equations. However, it is sometimes difficult to understand a complex industrial process well enough, and the resulting model may not meet the accuracy requirements of the process system. The latter are based on historical operation data and require no detailed information about the system. In recent years, data-driven modeling methods have developed rapidly as a result of the wide use of distributed control systems (DCS) and data acquisition systems.

As research has progressed, data-driven soft sensor methods have been further divided into three categories: latent variable modeling methods [7], statistical learning methods [8] and artificial intelligence methods [9]. The main idea of a latent variable model is to determine the mapping weights by optimization criteria to construct latent variables, and then to establish the regression model between the input variables and the output variable. The two most commonly used latent variable modeling methods are principal component regression (PCR) [10] and partial least squares (PLS) [11]. Statistical learning methods are based on the minimization of structural risk and have a more rigorous theoretical basis; the most commonly used one is support vector regression (SVR) [12]. For artificial intelligence methods, the structure of the prediction model is more complex and has better nonlinear approximation ability. The artificial neural network (ANN) [13] is the most popular artificial intelligence method. An ANN model has the capacity to learn and model complex nonlinear systems. Cybenko [14] pointed out that an ANN with one hidden layer can form an arbitrarily close approximation to any continuous nonlinear mapping, assuming only that the transfer function is non-constant, bounded and continuous. Within the ANN family, the back-propagation neural network (BPNN) is the most popular modeling method. It has been widely used in pattern recognition [15] and process optimization [16]. Normally, a BPNN has three layers: an input layer, a hidden layer and an output layer. Building a BPNN with m input nodes, n hidden nodes and 1 output node requires estimating (m+2)n+1 parameters [17], comprising weights and thresholds: mn input-to-hidden weights, n hidden thresholds, n hidden-to-output weights and 1 output threshold. However, the BPNN has some inherent drawbacks caused by the generality of the data-driven approach, e.g. the overfitting problem [18].
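
As a rough illustration of this three-layer structure, the following Python sketch counts the (m+2)n+1 parameters and runs one forward pass. The tanh hidden activation, the linear output node, the uniform initialization and the names `count_parameters`, `init_bpnn` and `forward` are assumptions made for the example; they are not taken from the paper.

```python
import math
import random


def count_parameters(m: int, n: int) -> int:
    """Parameters of an m-n-1 network: m*n input-to-hidden weights,
    n hidden thresholds, n hidden-to-output weights, 1 output threshold."""
    return m * n + n + n + 1  # equals (m + 2) * n + 1


def init_bpnn(m: int, n: int, seed: int = 0) -> dict:
    """Randomly initialize an m-n-1 network (uniform init is an assumption)."""
    rng = random.Random(seed)
    return {
        "w_h": [[rng.uniform(-1, 1) for _ in range(m)] for _ in range(n)],
        "b_h": [rng.uniform(-1, 1) for _ in range(n)],
        "w_o": [rng.uniform(-1, 1) for _ in range(n)],
        "b_o": rng.uniform(-1, 1),
    }


def forward(params: dict, x: list) -> float:
    """One forward pass: tanh hidden layer, linear output (both assumed)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["w_h"], params["b_h"])
    ]
    return sum(w * h for w, h in zip(params["w_o"], hidden)) + params["b_o"]
```

For m = 2 inputs and n = 10 hidden nodes, `count_parameters(2, 10)` gives 41, which matches the number of values stored by `init_bpnn(2, 10)`.
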

In recent years, researchers have proposed different strategies to avoid the overfitting problem. For example, Srivastava et al. [19] initialized and optimized the weight parameters of training samples to avoid overfitting. Cozad et al. [20] used regularization, which adds a penalty term to the loss function to discourage the coefficients from reaching large values, to control overfitting. Daniels [21] developed a special class of monotonic neural networks and a corresponding training algorithm to overcome overfitting. Cheng and Li [22] proposed constructing a BPNN model that incorporates monotonicity prior knowledge from experts as a constraint term, so that the training process becomes an optimization problem constrained by the monotonicity prior knowledge. However, such prior knowledge is sometimes difficult to obtain. For example, for some complex chemical processes, it is impossible for experts to analyze the process mechanism. In this paper, a new knowledge acquisition method is proposed to extract monotonic information from historical process data without the help of experts. Furthermore, the acquired knowledge is used as constraints when building the soft sensor model to improve its extrapolation performance.
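
One simple way to picture a monotonicity constraint in training is sketched below: a finite-difference check counts the sample points at which a model's prediction moves against the expected direction, and that count is added to the fitting error as a penalty. This is only an assumed formulation for illustration; the step size `delta`, the penalty `weight` and the function names are hypothetical, and the constrained problem actually solved in the paper may differ.

```python
def count_violations(predict, samples, feature, direction, delta=1e-2):
    """Count samples where increasing input `feature` by `delta` moves the
    prediction against `direction` (+1 = increasing, -1 = decreasing)."""
    violations = 0
    for x in samples:
        shifted = list(x)
        shifted[feature] += delta
        if direction * (predict(shifted) - predict(x)) < 0:
            violations += 1
    return violations


def penalized_loss(predict, samples, targets, constraints, weight=1.0):
    """Mean squared error plus a penalty proportional to the total number of
    monotonicity violations. `constraints` maps feature index -> direction."""
    mse = sum((predict(x) - t) ** 2 for x, t in zip(samples, targets)) / len(samples)
    total = sum(
        count_violations(predict, samples, j, d) for j, d in constraints.items()
    )
    return mse + weight * total
```

Because the violation count is not differentiable, a loss of this kind is better suited to a derivative-free search, such as an evolutionary algorithm, than to plain gradient descent.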

The remainder of the paper is organized as follows. The theoretical background of BPNN modeling and the Spearman coefficient is reviewed in Section 2. Section 3 proposes the method for acquiring sensitivity knowledge and the soft sensor modeling method based on BPNN and the acquired monotonicity knowledge. A numerical experiment and an industrial case that measures the cracking severity of an ethylene furnace are used to test the effectiveness of the proposed method in Sections 4 and 5, respectively. Finally, brief conclusions are given in Section 6.

Section snippets

A hybrid BPNN modeling method combined with prior information

BPNN is a multi-layer feedforward artificial neural network trained by the error back-propagation algorithm proposed by Rumelhart and McClelland [23]. For a BPNN model, training means optimizing the weights and thresholds. If the weights and thresholds are suitable, the model fits the training samples to the required degree while also achieving better prediction performance. For the BPNN, the model function can be written as Eq. (1): $h_i(x) = f_1(w_i^h x + b_i^h)$, $y = f_2($

Filtering treatment for the distributed scatter plot

The distributed scatter plot can describe the nonlinear monotone relationship between the covariates and the response variable, as introduced in Section 2. However, there is some interference among the covariates: some points are scattered off the diagonal line, which may affect the mining of the monotone prior knowledge. This paper introduces binary (0–1) integer programming [29] to filter the interfering points in the distributed scatter plot.
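
The monotone relationship that the scatter plot is meant to reveal can be summarized numerically by the Spearman coefficient, the rank-based quantity this paper builds on. The sketch below is a plain-Python version using the classical formula ρ = 1 − 6Σd²/(n(n²−1)), which assumes no tied values. The function names are illustrative, and the sketch does not reproduce the paper's scatter-plot construction or its 0–1 filtering step.

```python
import math


def ranks(values):
    """Return 1-based ranks of `values` (assumes all values are distinct)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = float(rank)
    return result


def spearman(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)).
    Valid only when neither x nor y contains ties."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n ** 2 - 1))


x = [0.1 * i for i in range(20)]
rho_up = spearman(x, [math.exp(v) for v in x])   # nonlinear but increasing
rho_down = spearman(x, [-(v ** 3) for v in x])   # nonlinear but decreasing
```

A strictly increasing pair gives a coefficient of 1.0 and a strictly decreasing pair gives -1.0, regardless of how nonlinear the relationship is; values near 0 indicate no consistent monotone trend.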

Firstly, the distributed scatter plot

A numerical example

In this section, a numerical example is used to validate the performance of the BPNN modeling method with monotonic knowledge. The bivariate function is given by Eq. (17), and a plot of Eq. (17) is provided in the supplementary material.

$$y = \sin(x_1) \times \cos(2x_2) \tag{17}$$
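
As a small illustration of Eq. (17), the sketch below evaluates the function on inputs drawn independently from a Gaussian distribution with mean 0 and variance 1, which is the sampling scheme used in this example. The sample size, the random seed and the function names are assumptions chosen for the sketch, not values reported in the paper.

```python
import math
import random


def target(x1: float, x2: float) -> float:
    """Eq. (17): y = sin(x1) * cos(2 * x2)."""
    return math.sin(x1) * math.cos(2.0 * x2)


def sample_dataset(n_samples: int, seed: int = 0):
    """Draw (x1, x2) from independent N(0, 1) and evaluate Eq. (17).
    `n_samples` and `seed` are illustrative choices."""
    rng = random.Random(seed)
    inputs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
    outputs = [target(x1, x2) for x1, x2 in inputs]
    return inputs, outputs


inputs, outputs = sample_dataset(200)
```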

The data for each covariate are sampled from a Gaussian distribution with expectation 0 and variance 1. In this section, a BPNN is used to describe the nonlinear process. The structure of the BPNN model is set to 2-10-1, which

Application to the cracking severity

Ethylene production has become a measure of the development level of a country's petrochemical industry. The cracking furnace is a key piece of equipment in ethylene production and plays an important role in production capacity, stable operation and energy consumption. Therefore, how to control the various indicators of the ethylene cracking furnace and promote the severity of the cracking reaction has become a focus of attention. The flow chart of the SRT-Ⅲ cracking furnace is given in

Conclusions

In this paper, the distributed scatter plot of bivariate variables based on the Spearman coefficient is employed to analyze the monotonicity sensitivity of observed data. When there is strong coupling as well as noise among the variables, a 0–1 integer linear programming model is used to filter the noise in the distributed scatter plot, and the PCA method is used for decoupling. Then the obtained number of violations is added to the BPNN model. Finally, a new EA, namely

CRediT authorship contribution statement

Yang Zhou: Conceptualization, Methodology, Software, Validation, Data curation, Writing - original draft. Shaojun Li: Writing - review & editing, Supervision, Funding acquisition.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors of this paper appreciate the National Natural Science Foundation of China (under Project No. 21676086) and the Fundamental Research Funds for the Central Universities (222201917006) for their financial support.

References (40)

  • K. Deep et al.

    Quadratic approximation based hybrid genetic algorithm for function optimization

    Appl. Math. Comput.

    (2008)
  • Z. Wang et al.

    A new constraint handling method based on the modified Alopex-based evolutionary algorithm

    Comput. Ind. Eng.

    (2014)
  • Z. Ge

    Quality prediction and analysis for large-scale processes based on multi-level principal component modeling strategy

    Contr. Eng. Pract.

    (2014)
  • K.M. van Geem et al.

    Molecular reconstruction of naphtha steam cracking feedstocks based on commercial indices

    Comput. Chem. Eng.

    (2007)
  • Z. Ge

    Process data analytics via probabilistic latent variable models: a tutorial review

    Ind. Eng. Chem. Res.

    (2018)
  • Y. Zhou et al.

    Enhancing quality of multivariate process monitoring based on vine copula and active learning strategy

    Ind. Eng. Chem. Res.

    (2018)
  • Y. Zhou et al.

    Improved vine copula-based dependence description for multivariate process monitoring based on ensemble learning

    Ind. Eng. Chem. Res.

    (2019)
  • X. Peng et al.

    An online performance monitoring and modeling paradigm based on just-in-time learning and extreme learning machine for non-Gaussian chemical process

    Ind. Eng. Chem. Res.

    (2017)
  • L. Yao et al.

    Deep learning of semi-supervised process data with hierarchical extreme learning machine and soft sensor application

    IEEE Trans. Ind. Electron.

    (2018)