Elsevier

Chemical Engineering Science

Volume 279, 5 September 2023, 118958

Soft sensor modeling for small data scenarios based on data enhancement and selective ensemble

https://doi.org/10.1016/j.ces.2023.118958

Highlights

  • A new generative model SV-WGANgp is proposed for generating virtual labeled samples.

  • SV-WGANgp combines SVAE and WGANgp to improve the quality of the generated samples.

  • Multi-objective optimization is used to achieve ensemble pruning of the enhanced base models.

  • A soft sensor framework, DESE, based on data enhancement and selective ensemble is developed for small data scenarios.

  • SV-WGANgp and DESE are verified on two chemical processes.

Abstract

Data-driven soft sensors have been widely used to facilitate real-time estimations of difficult-to-measure variables. However, sufficient high-quality data are often difficult to obtain due to high cost or low sampling rate. Thus, a soft sensor method based on data enhancement and selective ensemble (DESE) is proposed. First, a generative model is proposed for generating virtual labeled samples by combining supervised variational autoencoder (SVAE) and Wasserstein GAN with gradient penalty (WGAN-gp), which is referred to as SV-WGANgp. Then, SV-WGANgp is trained on various resampled training subsets to generate different sets of virtual samples. Next, diverse enhanced base models are constructed with the extended training sets. Subsequently, a multi-objective optimization (MOO) is utilized to achieve ensemble pruning. Finally, the final prediction is obtained by fusing the selected models. The application results on an industrial chlortetracycline fermentation process and a simulated penicillin fermentation process verify the effectiveness and superiority of the proposed methods.

Introduction

The process industry is an important pillar of national economic and social development, providing raw materials and energy for downstream manufacturing (Yuan et al., 2018, Zhu and Goldberg, 2009). To achieve smart and optimal manufacturing in the process industry, it is appealing to implement advanced process monitoring, control and optimization technologies, which rely heavily on real-time measurements of key product quality variables. However, many key quality variables cannot be measured in real time because online measurement instruments are lacking, or because measurements in harsh environments are difficult and costly (Yuan et al., 2020). As a promising indirect measurement method, a soft sensor estimates difficult-to-measure variables by constructing a mathematical model between the secondary variables and the primary variable. This approach effectively overcomes the problems of offline analysis delay and large sampling intervals, making it possible to estimate quality variables that are highly relevant to product efficiency (Fortuna et al., 2007). Over the past decades, soft sensor technology has been widely used in the chemical, biological and metallurgical industries, among others, because of its low cost, strong anti-interference ability, easy maintenance and high real-time performance (Cardoso et al., 2022, Haimi et al., 2013, Liu et al., 2018).

Soft sensor modeling approaches are generally divided into mechanistic and data-driven modeling. Mechanistic methods require an in-depth understanding of the internal biochemical reaction mechanisms of production processes and aim to establish accurate mathematical models. However, building such models is difficult or even impossible, because industrial processes often exhibit strong nonlinearity, high complexity and uncertainty. In contrast, data-driven modeling only needs to mine effective information from historical data to capture the mathematical relationship between the secondary variables and the primary variable (Shi et al., 2020). Benefiting from the rapid development of computational intelligence and process data analytics, data-driven soft sensor modeling methods have attracted growing attention (Jiang et al., 2020, Zeng and Ge, 2021, Jin et al., 2019a). Commonly used data-driven soft sensor models include principal component regression (PCR), partial least squares (PLS), support vector regression (SVR), Gaussian process regression (GPR), artificial neural networks (ANN), deep neural networks (DNN), etc. (Dai et al., 2022, Fezai et al., 2020, Liu et al., 2022, Qi et al., 2022, Sarstedt et al., 2022, Yu et al., 2022).

With the rapid development of information and communication technology, we have entered the cloud era, in which massive amounts of data flood into all walks of life. Unfortunately, small sample problems persist and make it challenging to construct well-performing data-driven soft sensor models. There are two main reasons for this. On the one hand, samples obtained from a stable production process are usually similar, which makes it difficult to extract sufficient and valuable information (Li et al., 2014). On the other hand, because certain data occur with low probability and data acquisition is difficult and costly, the process data available for modeling and analysis are often limited (Chai et al., 2022).

The methods proposed for addressing small sample problems can be roughly divided into three types: grey model based methods, feature extraction based methods and sample generation based data enhancement methods (Tian et al., 2021). Among them, the grey model (Ma and Liu, 2018) is abstracted from the grey system, which contains both known information and unknown or uncertain information. It is an effective tool for small sample problems because it makes full use of a small amount of incomplete information to establish mathematical models and make predictions. However, grey model based methods depend strongly on historical data and are only suitable for short- and medium-term prediction. In addition, since the correlation between factors within the system is ignored, these methods are prone to large prediction deviations. For high-dimensional small data in industrial processes, latent variable information can be extracted for dimensionality reduction (He et al., 2022, Huang and Versaci, 2021). Specifically, feature extraction based methods apply an interpolation algorithm to the extracted data features to obtain virtual inputs, thereby expanding the small sample set. Nevertheless, some important information may be lost during feature extraction, which ultimately limits the improvement in model performance. Alternatively, sample generation based data enhancement methods have recently been proposed to generate entirely new samples by learning the features or distributions of the original samples. Some recent studies have shown that adding the generated samples to the original sample set can enhance soft sensor performance (Chen et al., 2017, Zhang et al., 2021, Zhang et al., 2022, Zhu et al., 2020).

Recently, deep learning (DL) has been introduced to solve small sample modeling problems due to its strong representation ability. The main idea of DL based data enhancement methods is to fill the gaps in the small original data set with generated virtual samples. Popular DL models for this purpose include the variational autoencoder (VAE) (Boquet et al., 2020), the generative adversarial network (GAN) (Han et al., 2018) and their variants (Creswell et al., 2018). For VAE based sample generation, most work focuses on learning the underlying distribution of the data. However, when the data distribution is complex, VAE is not suitable for large-scale generation and the model complexity increases accordingly. Another possible issue is that the samples obtained by decoding randomly sampled latent variables do not completely follow the distribution of the original samples. When GAN is used for sample generation, the probability distribution of each point in the sample space can be estimated indirectly through the game between generation and discrimination (Tian et al., 2019). In practice, however, the noise inputs sampled for the generator are random, so network training easily falls into mode collapse. To tackle this issue, Dumoulin et al. (Dumoulin et al., 2016) introduced an adversarially learned inference (ALI) model that combines the respective advantages of a generation network and an inference network. This model performs well on semi-supervised SVHN and CIFAR10 tasks. Even so, exploding and vanishing gradients are still common when training GAN. To this end, Du et al. (Du et al., 2021) proposed an improved model based on WGAN-gp to alleviate the training problem and further improve the quality of generated images.

In addition to the above work on image processing tasks, a growing number of novel and efficient generative models have begun to play an important role in soft sensor applications. Wang and Liu (Wang and Liu, 2020) combined VAE and WGAN to construct a generative model, VA-WGAN, and then used the decoder of VAE to generate samples. VA-WGAN not only ensures the reliability of generated samples, but also promotes the smooth training of WGAN. To improve the ability of VAE to extract deep features, Gao et al. (Gao et al., 2021) proposed a novel data enhancement method based on SVAE-WGAN, which achieves sample generation by stacking multiple VAEs. However, some challenges remain in obtaining good-quality generated samples. Firstly, generative models usually generate only the inputs of samples, while the outputs depend entirely on the predictions of models built on labeled samples. The correlation between inputs and outputs generated in this way is not strong enough to guarantee the accuracy of the prediction model, especially when the training sample size is small. Secondly, DL based data enhancement methods cannot guarantee that all generated samples come from the original sample distribution. Samples that deviate from the original distribution will degrade the prediction performance of soft sensor models. Thirdly, the diversity of generated samples is also a key factor in constructing high-performance soft sensor models. Even if the training samples for the generative model are the same, the samples it produces usually show a certain degree of randomness, which does not help to ensure the stability of model predictions.

To address the problems mentioned above, this paper proposes a novel generative model, SV-WGANgp, and develops a soft sensor modeling method, DESE, based on data enhancement and selective ensemble. SV-WGANgp generates labeled samples by combining SVAE and WGAN-gp. DESE is then built by integrating a group of enhanced GPR base models, which are trained on the extended training samples and selected through multi-objective optimization. The effectiveness of the proposed SV-WGANgp and DESE methods is verified on a numerical simulation case and an industrial case. Experimental results show that, by employing SV-WGANgp for data enhancement and multi-objective optimization for ensemble pruning, the DESE method can significantly improve the prediction accuracy of soft sensor models on small sample problems. The main contributions of this work are twofold:

  • (1) A supervised sample generation method, SV-WGANgp, is proposed. Unlike traditional generative models, which generate inputs and then estimate outputs, SV-WGANgp directly generates complete samples, i.e., inputs and outputs simultaneously. This greatly simplifies the sample generation process. In addition, compared with using supervised VAE or WGAN-gp alone, SV-WGANgp combines the powerful ability of SVAE to learn data distributions with the adversarial learning idea of WGAN-gp, thus achieving more stable generation of high-quality samples.

  • (2) A selective ensemble modeling framework is formulated by introducing data enhancement. The framework first constructs diverse enhanced base models through a sample perturbation mechanism, using training samples enlarged by the generated samples. Then, the enhanced base models that satisfy both accuracy and diversity requirements are selected through a multi-objective optimization ensemble pruning strategy. Finally, the selected base models are fused to improve prediction accuracy and reliability. This framework can effectively overcome the challenge of scarce labeled data, the negative impact of low-quality generated samples, and the instability of the prediction model caused by the randomness of generated samples.
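The workflow in contribution (2) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation and makes several loudly stated assumptions: a least-squares line stands in for the GPR base models, a simple noise-jitter function (`jitter_samples`, hypothetical) stands in for SV-WGANgp, diversity is measured as mean absolute disagreement with the other models, and pruning keeps the Pareto non-dominated models on (validation error, diversity) rather than running a full multi-objective optimizer. All function names and parameter values are illustrative.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (stand-in for a GPR base model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return a, my - a * mx

def predict(model, xs):
    a, b = model
    return [a * x + b for x in xs]

def jitter_samples(xs, ys, n_new, rng, scale=0.05):
    """Placeholder generator (NOT SV-WGANgp): perturbs real (x, y) pairs."""
    picks = [rng.randrange(len(xs)) for _ in range(n_new)]
    return ([xs[i] + rng.gauss(0, scale) for i in picks],
            [ys[i] + rng.gauss(0, scale) for i in picks])

def rmse(pred, ys):
    return mean((p - y) ** 2 for p, y in zip(pred, ys)) ** 0.5

def pareto_front(scores):
    """Indices of (error, diversity) pairs not dominated by any other pair."""
    keep = []
    for i, (e_i, d_i) in enumerate(scores):
        dominated = any(
            e_j <= e_i and d_j >= d_i and (e_j < e_i or d_j > d_i)
            for j, (e_j, d_j) in enumerate(scores) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

def dese_sketch(train, valid, n_models=8, n_virtual=10, seed=0):
    rng = random.Random(seed)
    (xs, ys), (vx, vy) = train, valid
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in range(len(xs))]  # resampled subset
        sx, sy = [xs[i] for i in idx], [ys[i] for i in idx]
        gx, gy = jitter_samples(sx, sy, n_virtual, rng)         # virtual labeled samples
        models.append(fit_line(sx + gx, sy + gy))               # enhanced base model
    preds = [predict(m, vx) for m in models]
    scores = []
    for i, p in enumerate(preds):
        others = [q for j, q in enumerate(preds) if j != i]
        diversity = mean(mean(abs(a - b) for a, b in zip(p, q)) for q in others)
        scores.append((rmse(p, vy), diversity))
    selected = pareto_front(scores)                             # simplified pruning step
    return [models[i] for i in selected], selected

def ensemble_predict(models, xs):
    """Fuse the selected models by simple averaging."""
    return [mean(col) for col in zip(*(predict(m, xs) for m in models))]

# Toy data: y = 2x + 1 with small noise, split into training and validation halves.
data_rng = random.Random(1)
x_all = [i / 10 for i in range(20)]
y_all = [2.0 * x + 1.0 + data_rng.gauss(0, 0.1) for x in x_all]
train, valid = (x_all[::2], y_all[::2]), (x_all[1::2], y_all[1::2])
selected_models, selected_idx = dese_sketch(train, valid)
y_hat = ensemble_predict(selected_models, valid[0])
```

Keeping the Pareto front is only one way to trade accuracy against diversity; an evolutionary multi-objective search over model subsets would explore combinations that this per-model filter cannot.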

The rest of this paper proceeds as follows. Section 2 provides a detailed description of the SVAE model, the WGAN-gp model and the proposed SV-WGANgp model. Section 3 presents the soft sensor method DESE based on data enhancement and selective ensemble. In Section 4, two case studies are reported to verify the effectiveness of SV-WGANgp and the DESE method. Finally, the conclusions are drawn in Section 5.

Section snippets

Proposed SV-WGANgp

In this section, SVAE and WGAN-gp are first briefly explained. Then, a detailed introduction of the proposed deep generative network SV-WGANgp is presented.
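For orientation, the critic and generator objectives of WGAN-gp as commonly stated in the literature are given below. The notation is an assumption and may differ from that used in the full paper: $D$ is the critic, $P_r$ and $P_g$ are the real and generated sample distributions, $\lambda$ is the gradient penalty coefficient, and $\hat{x}$ is sampled along straight lines between real and generated samples.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big]
     - \mathbb{E}_{x \sim P_r}\big[D(x)\big]
     + \lambda\, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
       \Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
\hat{x} &= \epsilon x + (1 - \epsilon)\,\tilde{x}, \qquad \epsilon \sim U[0, 1], \\
L_G &= -\mathbb{E}_{\tilde{x} \sim P_g}\big[D(\tilde{x})\big].
\end{aligned}
```

The penalty term pushes the critic's gradient norm toward 1 at the interpolated points, which is how WGAN-gp enforces the Lipschitz constraint without weight clipping.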

DESE method for soft sensor modeling

Though SV-WGANgp can be applied for generating samples in order to solve the problem of data scarcity, we still encounter two difficulties in soft sensor modeling. One is that the generated samples have uncertainties and it is not easy to select samples one by one. The other one is that it is difficult to obtain a high-performance model built with generated samples. To this end, we propose a soft sensor modeling method based on data enhancement and selective ensemble (DESE). On the one hand,

Case studies

To verify the effectiveness of the proposed sample generative model SV-WGANgp and the soft sensor method DESE, an industrial chlortetracycline fermentation process and a simulated fed-batch penicillin fermentation process are used in this section. The methods used for comparison are as follows:

  • (1) GPR (Rasmussen and Williams): Gaussian process regression with a Matérn covariance function and a noise term.

  • (2) SVAE (Xie et al., 2019): supervised variational autoencoder based sample generation.

  • (3) WGAN-gp (
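For the GPR baseline in item (1), a Matérn covariance with a noise term can be written down compactly. The sketch below assumes smoothness ν = 5/2 and scalar inputs, and uses hypothetical hyperparameter names (`length_scale`, `signal_var`, `noise_var`) with arbitrary values; the source does not state which Matérn variant or hyperparameters were used.

```python
import math

def matern52(x1, x2, length_scale=1.0, signal_var=1.0):
    """Matern covariance with nu = 5/2 (an assumed choice) for scalar inputs."""
    s = math.sqrt(5.0) * abs(x1 - x2) / length_scale
    return signal_var * (1.0 + s + s * s / 3.0) * math.exp(-s)

def noisy_covariance(xs, length_scale=1.0, signal_var=1.0, noise_var=0.01):
    """Covariance matrix K + noise_var * I over the training inputs xs."""
    n = len(xs)
    return [
        [matern52(xs[i], xs[j], length_scale, signal_var)
         + (noise_var if i == j else 0.0)   # noise only on the diagonal
         for j in range(n)]
        for i in range(n)
    ]

# Covariance between three training inputs; nearby inputs are more correlated.
K = noisy_covariance([0.0, 0.5, 2.0])
```

The noise term appears only on the diagonal, so it models independent measurement noise on each training label rather than changing the correlation between distinct inputs.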

Conclusions

The main contributions of this paper are two-fold. On the one hand, a deep generative model SV-WGANgp is proposed. This model combines the advantages of SVAE and WGAN-gp to realize simultaneous generation of sample input and output. On the other hand, a novel soft sensor method DESE is developed for small sample environment by combining SV-WGANgp with selective ensemble learning. DESE first generates diverse extended labeled samples by SV-WGANgp, and further constructs a set of enhanced base

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported in part by the National Natural Science Foundation of China (NSFC) (62163019, 61863020) and the Applied Basic Research Project of Yunnan Province (202101AT070096).

References (55)

  • K. Qi et al., Joint sparse principal component regression with robust property, Expert Syst. Appl. (2022)
  • M. Sarstedt et al., Latent class analysis in PLS-SEM: A review and recommendations for future applications, J. Bus. Res. (2022)
  • C. Smith et al., Evolutionary multi-objective generation of recurrent neural network ensembles for time series prediction, Neurocomputing (2014)
  • C. Tian et al., Data driven parallel prediction of building energy consumption using generative adversarial nets, Energy Build. (2019)
  • X. Wang et al., Data supplement for a soft sensor using a new generative model based on a variational autoencoder and Wasserstein GAN, J. Process Control (2020)
  • D. Xu et al., Convergence of the RMSProp deep learning method with penalty for nonconvex optimization, Neural Networks (2021)
  • J. Yang et al., Online prediction for contamination of chlortetracycline fermentation based on Dezert-Smarandache theory, Chin. J. Chem. Eng. (2015)
  • H. Yu et al., A novel constrained dense convolutional autoencoder and DNN-based semi-supervised method for shield machine tunnel geological formation recognition, Mech. Syst. Sig. Process. (2022)
  • X. Yuan et al., A novel semi-supervised pre-training strategy for deep networks and its application for quality variable prediction in industrial processes, Chem. Eng. Sci. (2020)
  • L. Zeng et al., Bayesian network for dynamic variable structure learning and transfer modeling of probabilistic soft sensor, J. Process Control (2021)
  • M. Zhang et al., A method for capacity prediction of lithium-ion batteries under small sample conditions, Energy (2022)
  • X.H. Zhang et al., Novel manifold learning based virtual sample generation for optimizing soft sensor with small data, ISA Trans. (2021)
  • M. Arjovsky, L. Bottou, Towards principled methods for training generative adversarial networks, arXiv preprint... (2017)
  • M. Arjovsky et al., Wasserstein generative adversarial networks, in: International Conference on Machine Learning, PMLR (2017)
  • W. Cardoso et al., Artificial Neural Network for Predicting Silicon Content in the Hot Metal Produced in a Blast Furnace Fueled by Metallurgical Coke, Mater. Res. (2022)
  • Z. Chai et al., A deep probabilistic transfer learning framework for soft sensor modeling with missing data, IEEE Trans. Neural Networks Learn. Syst. (2022)
  • A. Creswell et al., Generative adversarial networks: An overview, IEEE Signal Process Mag. (2018)