Modeling data-driven sensor with a novel deep echo state network

doi:10.1016/j.chemolab.2020.104062

Chemometrics and Intelligent Laboratory Systems

Volume 206, 15 November 2020, 104062

https://doi.org/10.1016/j.chemolab.2020.104062

Highlights

• The proposed ADESN preserves many more input features than a traditional ESN of the same reservoir size.
• The ADESN can realize a selective memory.
• The ADESN performs well on tasks with long-term dependencies.

Abstract

Data-driven approaches are widely used to build soft sensors that predict key quality variables in process engineering. A soft sensor is generally a time-dependent dynamical model relating the input to the output. The echo state network (ESN) is a typical data-driven modeling tool that has shown excellent performance in temporal data processing. However, the memory mode of the traditional ESN lacks flexibility: it is sometimes hard to preserve sufficient input features in the states, especially when modeling soft sensors with long-term dependencies. To solve this problem, this paper proposes an asynchronously deep echo state network (ADESN), which is composed of a number of sub-reservoirs connected one by one in sequence, with time delay modules inserted between every two adjacent layers. The ADESN scheme preserves more input history in the states and can realize a selective memory. The validity of the ADESN is demonstrated by modeling a number of numerical and real-life soft sensors.

Introduction

A soft sensor is a model that estimates quality variables which are usually difficult to measure with hardware sensors. Soft sensors have gained much attention in process engineering [1–4]. There are two major techniques for building a soft sensor model: model-driven and data-driven [1,5]. The model-driven technique requires in-depth process knowledge of the system and is therefore often impractical, given the complexity of industrial processes. For this reason, the data-driven technique has attracted increasing interest, since the resulting soft sensors are easy to develop and to embed in automatic control systems [1,6–8]. To date, various data-driven soft sensors have been applied, such as the support vector machine (SVM) [6], the Gaussian mixture model (GMM) [9], and the artificial neural network (ANN) [10–12]. Among these methods, the ANN stands out for its ability to handle nonlinearity and to learn from data.

A soft sensor in process engineering is usually regarded as a time-dependent dynamic model, meaning that its output depends significantly on the input history. A common way to realize such a dynamical model with a traditional ANN is to make a finite input history available to the model through a sliding window, in which the current and past inputs are tapped from a delay line [13]. This essentially converts a dynamical mapping into a static one. The recurrent neural network (RNN) offers an alternative solution to the memory demand. An RNN is a dynamical system with a high-dimensional internal state. When driven by the external input, the internal state preserves some information about the input history, so it is unnecessary to feed delayed versions of the input into the network. Hence, in principle, an RNN is better suited than a feed-forward ANN to modeling time-dependent soft sensors.
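The sliding window idea can be illustrated with a minimal sketch. The function name `sliding_window` and the toy sequence are illustrative assumptions, not taken from the paper; the point is only that each row of the resulting matrix holds a finite input history, so a static model can be fitted to it.

```python
import numpy as np

def sliding_window(u, window):
    """Stack u(k), u(k-1), ..., u(k-window+1) into one row per time step k."""
    u = np.asarray(u, dtype=float)
    n = len(u) - window + 1
    # Row j corresponds to time k = j + window - 1; column d holds u(k - d).
    cols = [u[window - 1 - d : window - 1 - d + n] for d in range(window)]
    return np.stack(cols, axis=1)

u = np.arange(6.0)           # toy input sequence 0, 1, ..., 5
X = sliding_window(u, 3)     # each row: [u(k), u(k-1), u(k-2)]
print(X)
```

The window length fixes in advance how much history the static model can see, which is the limitation that recurrent models avoid.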

The echo state network (ESN) is a typical RNN with a fixed state transition structure (the reservoir) and an adaptable readout from the state space [14,15]. The neurons in the reservoir are sparsely and randomly connected to each other or to themselves. A significant advantage of the ESN approach is that only the output connection weights need to be adapted by learning, and the optimal output weights can be computed by linear regression algorithms [15]. This significantly reduces the training complexity compared with a conventional RNN [16]. Theoretical study shows that the ESN can approximate any dynamical system with arbitrary accuracy [17]. The ESN and its many variants have been successfully applied in modeling and control [18–21].
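A minimal sketch of this scheme follows, assuming a standard tanh reservoir update and a ridge-regression readout. The helper names, weight ranges, spectral radius, reservoir size and the one-step recall task are illustrative assumptions rather than settings from the paper.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, density=0.1, seed=0):
    """Fixed random input and reservoir weights (never trained)."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    W = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    W *= rng.random((n_res, n_res)) < density          # sparse random connections
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, U):
    """Collect states x(k) = tanh(W_in u(k) + W x(k-1)) for an input array U (T x n_in)."""
    X = np.zeros((len(U), W.shape[0]))
    x = np.zeros(W.shape[0])
    for k, u in enumerate(U):
        x = np.tanh(W_in @ u + W @ x)
        X[k] = x
    return X

def train_readout(X, Y, ridge=1e-6):
    """Only the readout is learned, in closed form by ridge regression."""
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

rng = np.random.default_rng(1)
U = rng.uniform(-0.5, 0.5, size=(1000, 1))
Y = np.roll(U, 1, axis=0)                  # toy task: recall the previous input u(k-1)
W_in, W = make_reservoir(n_in=1, n_res=50)
X = run_reservoir(W_in, W, U)
washout = 100                              # discard the initial transient
W_out = train_readout(X[washout:], Y[washout:])
Y_hat = X[washout:] @ W_out
corr = np.corrcoef(Y[washout:, 0], Y_hat[:, 0])[0, 1]
print(f"correlation between target and readout: {corr:.3f}")
```

Because `W_in` and `W` stay fixed, training reduces to solving one linear system, which is the source of the low training cost mentioned above.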

To model a time-dependent soft sensor accurately with an ESN, the ESN's states should preserve sufficient input history; otherwise the output layer cannot extract the required input features from the states. Theoretically, an ESN has an infinite memory capacity, but in practice the features preserved in the ESN's states fade over time. This implies that input features near the current time are significant, while input features far from the current time are too faint to be recalled. Researchers have proposed a number of ways to make the ESN preserve more input features [22–26]. Most of them consider that strong couplings among reservoir neurons inhibit the ESN's memory capacity: the coupling makes it difficult for a reservoir to generate rich dynamics, and harder still to preserve a large number of features implied in the input stream. To weaken the couplings among reservoir neurons, several improved ESN versions have been presented, such as the decoupled ESN [22], grouped ESN [23], tree ESN [24], multiple layered ESN [25], and deep ESN (DESN) [26]. All of these improvements aim to reduce the couplings among reservoir neurons by dividing the single reservoir into a number of sub-reservoirs. Among them, the DESN is promising because of its deep topology and hierarchical processing of temporal information, which has a remarkable biological plausibility as evidenced by studies in neuroscience [27,28]. In the DESN, the sub-reservoirs are connected one by one in sequence. Some experiments indicate that the DESN can generate more dynamics and has a larger memory capacity than the traditional ESN [27].

From a psychological point of view, memory in the human brain does not fade continuously over time but tends to be selective [29–31]. When people receive and process communicated content, they do not accept all of it without analysis. They actively and selectively memorize the parts that are consistent with their own concepts, interests and hobbies, and exclude the rest from memory, so as to meet their own needs and achieve psychological balance. The selective memory mode lets the human brain capture key information easily with limited resources. From a practical point of view, it is unnecessary and unrealistic to remember everything that has happened in chronological order. In most real-life cases, remembering a few key points is enough to outline the whole picture of an event. This is also true for soft sensor modeling: in most cases, only a number of discrete inputs at different time steps are fed into the models [1,2,5,6]. It is therefore sensible for the ESN to preserve selectively the input information required for a given task.

To improve the performance of soft sensors, this paper proposes an asynchronously deep echo state network (ADESN). The ADESN is an extension of the DESN. Compared with the DESN, the ADESN meets the selective memory requirement of soft sensors by introducing delayed links between every two adjacent layers. This selective memory mode further helps the reservoir to store more of the required input features.

The remainder of this paper is organized as follows. Section 2 introduces the principle of the ADESN; Section 3 examines the performance of the ADESN by modeling a number of numerical and real-life soft sensors; finally, Section 4 gives some conclusions.

Section snippets

Topology of ADESN

The proposed ADESN is composed of three components: the input layer, the asynchronous reservoir, and the output layer (Fig. 1). The asynchronous reservoir consists of a number of sub-reservoirs connected one by one in sequence, with delay modules inserted between every two adjacent sub-reservoirs. Owing to the multilayered topology of the asynchronous reservoir, the i-th sub-reservoir is also called the i-th layer. If the delay time D of the delay modules is set to zero,
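The topology described in this snippet can be sketched as follows. Since the paper's state equations are not reproduced here, the sketch makes its own assumptions: each layer uses a standard tanh reservoir update, the first layer is driven by the input, each later layer is driven by the state of the layer below taken `delay` steps earlier (zeros before that), and the states of all layers are concatenated for the readout. The function name `run_adesn` and all parameter values are hypothetical.

```python
import numpy as np

def run_adesn(U, n_layers=3, n_res=20, delay=2, spectral_radius=0.9, seed=0):
    """Stacked sub-reservoirs with a delay module between adjacent layers (assumed form)."""
    rng = np.random.default_rng(seed)
    U = np.asarray(U, dtype=float)
    T = len(U)
    states = np.zeros((n_layers, T, n_res))
    for i in range(n_layers):
        n_in = U.shape[1] if i == 0 else n_res
        W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
        W = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
        W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
        if i == 0:
            drive = U                                   # layer 1 sees the external input
        else:
            # Delay module: layer i sees the state of layer i-1 from `delay` steps ago.
            drive = np.zeros((T, n_res))
            drive[delay:] = states[i - 1][: T - delay]
        x = np.zeros(n_res)
        for k in range(T):
            x = np.tanh(W_in @ drive[k] + W @ x)
            states[i, k] = x
    # Concatenate all layer states so a linear readout can use every layer.
    return np.concatenate(list(states), axis=1)

rng = np.random.default_rng(1)
U = rng.uniform(-0.5, 0.5, size=(50, 1))
S = run_adesn(U, n_layers=3, n_res=20, delay=2)
print(S.shape)
```

In this sketch, layer i first receives information about an input after (i-1)·`delay` steps, so deeper layers hold progressively older input history; with `delay=0` every layer reads the layer below at the same time step.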

Investigating the STM in ADESN

In this section, we first investigate the memory mode of the ADESN experimentally. The experimental results reveal the memory modes of different ESN models visually. In Ref. [13], a method for estimating the memory capacity of the traditional ESN is proposed as follows:

$MC = \sum_{d=0}^{\infty} MC_{d}$

$MC_{d} = R^{2}(u(k-d), y_{d}(k))$

where $y_{d}(k)$ denotes the output of the readout unit trained to recall the input signal with a delay d, i.e. $u(k-d)$, and $R(u(k-d), y_{d}(k))$ denotes the correlation coefficient between u(k
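As a rough illustration of this estimate, the sketch below trains one linear readout per delay d on the states of a small tanh reservoir and records the squared correlation coefficient as MC_d. The infinite sum is truncated at a finite `max_delay`, and the reservoir size, weight ranges, ridge term and washout length are all assumptions made for the example, not values from the paper.

```python
import numpy as np

def memory_capacity(X, u, max_delay, washout=100, ridge=1e-8):
    """Estimate MC_d for d = 0..max_delay from states X (T x N) and scalar input u (T,)."""
    start = max(washout, max_delay)
    Xs = X[start:]
    A = Xs.T @ Xs + ridge * np.eye(Xs.shape[1])
    mc = np.zeros(max_delay + 1)
    for d in range(max_delay + 1):
        target = u[start - d : len(u) - d]            # u(k - d) aligned with x(k)
        w = np.linalg.solve(A, Xs.T @ target)         # readout trained to recall u(k - d)
        y_d = Xs @ w
        mc[d] = np.corrcoef(target, y_d)[0, 1] ** 2   # squared correlation coefficient
    return mc

# Small dense tanh reservoir driven by i.i.d. uniform input (illustrative settings).
rng = np.random.default_rng(0)
T, N = 2000, 30
u = rng.uniform(-0.5, 0.5, size=T)
W_in = rng.uniform(-0.5, 0.5, size=N)
W = rng.uniform(-1.0, 1.0, size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
X = np.zeros((T, N))
x = np.zeros(N)
for k in range(T):
    x = np.tanh(W_in * u[k] + W @ x)
    X[k] = x

mc = memory_capacity(X, u, max_delay=60)
print("MC_0 =", round(mc[0], 3), " MC_60 =", round(mc[-1], 3), " truncated MC =", round(mc.sum(), 2))
```

Plotting `mc` against d gives a memory curve; for a single reservoir like this one it decays with the delay, which is the fading behavior discussed in the Introduction.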

Conclusions

This paper proposes an ADESN approach for modeling soft sensors. The performance of the ADESN is analyzed from the viewpoint of dynamical memory. To model a time-dependent soft sensor accurately, it is pivotal both to preserve sufficient features in the states and to extract the required features from the states accurately at low cost. The ADESN adopts a selective memory mode that preserves the key input features and makes them significant in the states by optimizing the delay time. It is helpful to

CRediT authorship contribution statement

Ying-Chun Bo: Conceptualization, Methodology, Formal analysis, Software, Investigation. Ping Wang: Investigation, Funding acquisition. Xin Zhang: Software, Validation. Bao Liu: Investigation, Writing - original draft.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (21606256), the Fundamental Research Funds for the Central Universities (20CX05006A), and the Major Scientific and Technological Projects of CNPC under Grant ZD2019-183-007.

References (39)

B. Bidar et al., Data-driven soft sensor approach for online quality prediction using state dependent parameter models, Chemometr. Intell. Lab. Syst. (2017)
W. Zheng et al., Just-in-time semi-supervised soft sensor for quality prediction in industrial rubber mixers, Chemometr. Intell. Lab. Syst. (2018)
B. Lu et al., Semi-supervised online soft sensor maintenance experiences in the chemical industry, J. Process Contr. (2018)
Z. Liu et al., Adaptive soft sensors for quality prediction under the framework of Bayesian network, Contr. Eng. Pract. (2018)
Y. Meng et al., Data-driven soft sensor modeling based on twin support vector regression for cane sugar crystallization, J. Food Eng. (2019)
V. Gopakumar et al., A deep learning-based data driven soft sensor for bioprocesses, Biochem. Eng. J. (2018)
Gouhei Tanaka et al., Recent advances in physical reservoir computing: a review, Neural Network. (2019)
L. Grigoryeva et al., Echo state networks are universal, Neural Network. (2018)
J.P. Jordanou et al., Nonlinear model predictive control of an oil well with echo state networks (2018)
Y.C. Bo et al., Online adaptive dynamic programming based on echo state networks for dissolved oxygen control, Appl. Soft Comput. (2018)
Y. Xue et al., Decoupled echo state networks with lateral inhibition, Neural Network. (2007)
C. Gallicchio et al., Tree echo state networks, Neurocomputing (2013)
C. Gallicchio et al., Deep reservoir computing: a critical experimental analysis, Neurocomputing (2017)
C. Gallicchio et al., Local Lyapunov exponents of deep echo state networks, Neurocomputing (2018)
C. O’Donnell et al., Selective memory generalization by spatial patterning of protein synthesis, Neuron (2014)
A.K. Barbey, Network neuroscience theory of human intelligence, Trends Cognit. Sci. (2018)
G. Holzmann et al., Echo state networks with filter neurons and a delay&sum readout, Neural Network. (2010)
L. Fortuna et al., Soft analyzers for a sulfur recovery unit, Contr. Eng. Pract. (2003)
W.M. Shao et al., Adaptive soft sensor for quality prediction of chemical processes based on selective ensemble of local partial least squares models, Chem. Eng. Res. Des. (2015)

