Soft sensor modeling for small data scenarios based on data enhancement and selective ensemble
Introduction
The process industry is an important pillar of national economic and social development, providing raw materials and energy for downstream manufacturing (Yuan et al., 2018, Zhu and Goldberg, 2009). To achieve smart and optimal manufacturing in the process industry, it is appealing to implement advanced process monitoring, control and optimization technologies, which rely heavily on real-time measurements of key product quality variables. However, many key quality variables cannot be measured in real time due to the lack of online measurement instruments, or the difficulty and high cost of measurement in harsh environments (Yuan et al., 2020). As a promising indirect measurement method, a soft sensor estimates difficult-to-measure variables by constructing a mathematical model between the secondary variables and the primary variable. This method can effectively overcome the problems of offline analysis delay and large sampling intervals, which allows estimating those quality variables that are highly relevant to product efficiency (Fortuna et al., 2007). Over the past decades, soft sensor technology has been widely used in the chemical, biological and metallurgical industries, among others, because of its low cost, strong anti-interference ability, easy maintenance and high real-time performance (Cardoso et al., 2022, Haimi et al., 2013, Liu et al., 2018).
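To make the idea of a model between secondary variables and the primary variable concrete, the following is a minimal sketch only: it fits a linear least-squares model on a tiny made-up data set. The function names, the data and the choice of a linear model are all assumptions for illustration and are not taken from the paper, which uses GPR base models.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_soft_sensor(secondary, primary):
    """Least-squares fit of primary ~ w . secondary + bias via normal equations."""
    rows = [list(x) + [1.0] for x in secondary]  # append a bias column
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, primary)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict(weights, x):
    """Estimate the primary variable; the last weight is the bias."""
    return sum(w * v for w, v in zip(weights, x)) + weights[-1]


# Hypothetical data: two easy-to-measure variables and one lab-measured quality
# value, generated here from primary = 2 * x1 - x2 + 0.5.
secondary = [(1.0, 2.0), (2.0, 0.5), (3.0, 1.5), (4.0, 3.0)]
primary = [0.5, 4.0, 5.0, 5.5]
weights = fit_soft_sensor(secondary, primary)
estimate = predict(weights, (2.5, 1.0))  # online estimate for a new operating point
```

Once fitted on the few available laboratory measurements, such a model can be evaluated at every sampling instant of the secondary variables, which is what removes the offline analysis delay.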
Soft sensor modeling approaches are generally divided into mechanistic and data-driven modeling. The former requires an in-depth understanding of the internal biochemical reaction mechanisms of production processes and aims to establish accurate mathematical models. However, it is difficult or even impossible to build accurate mathematical models because industrial processes often exhibit strong nonlinearity, high complexity and uncertainty. In contrast, data-driven modeling only needs to mine effective information from historical data to capture the mathematical relationship between the secondary variables and the primary variable (Shi et al., 2020). Benefiting from the rapid development of computational intelligence and process data analytics, data-driven soft sensor modeling methods have attracted growing attention (Jiang et al., 2020, Zeng and Ge, 2021, Jin et al., 2019a). Commonly used data-driven soft sensor models include principal component regression (PCR), partial least squares (PLS), support vector regression (SVR), Gaussian process regression (GPR), artificial neural networks (ANN), deep neural networks (DNN), etc. (Dai et al., 2022, Fezai et al., 2020, Liu et al., 2022, Qi et al., 2022, Sarstedt et al., 2022, Yu et al., 2022).
With the rapid development of information and communication technology, we have entered the cloud era, where massive amounts of data are flooding into all walks of life. Unfortunately, the small sample problem persists and makes it challenging to construct well-performing data-driven soft sensor models. There are two main reasons for this. On the one hand, the samples obtained from a stable production process are usually similar, which makes it difficult to extract sufficient and valuable information (Li et al., 2014). On the other hand, due to the small probability of data occurrence and the difficulty and high cost of data acquisition, the process data available for modeling and analysis are often limited (Chai et al., 2022).
The methods proposed for addressing small sample problems can be roughly divided into three types: grey model based methods, feature extraction based methods and sample generation based data enhancement methods (Tian et al., 2021). Among them, the grey model (Ma and Liu, 2018) is abstracted from the grey system, which contains both known information and unknown or uncertain information. It is an effective tool for small sample problems because it makes full use of a small amount of incomplete information to establish mathematical models and make predictions. However, grey model based methods depend strongly on historical data and are only suitable for short- and medium-term prediction. In addition, since the correlation between factors within the system is ignored, these methods are prone to large prediction deviations. For high-dimensional small data in industrial processes, latent variable information can be extracted for dimensionality reduction (He et al., 2022, Huang and Versaci, 2021). Specifically, feature extraction based methods apply an interpolation algorithm to the extracted data features to obtain virtual inputs, thereby expanding the small sample set. Nevertheless, some important information may be lost after feature extraction, which ultimately restricts the improvement of model performance. Alternatively, sample generation based data enhancement methods have been proposed recently to generate completely different samples by learning the features or distributions of the original samples. Some recent studies have shown that adding the generated samples to the original sample set can enhance soft sensor performance (Chen et al., 2017, Zhang et al., 2021, Zhang et al., 2022, Zhu et al., 2020).
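As a rough illustration of how interpolation can expand a small sample set, the sketch below draws virtual samples as random convex combinations of pairs of original samples. This is a generic scheme chosen for brevity, not the procedure of any cited method; the function name, the sample layout (features followed by the label) and the data are assumptions.

```python
import random


def interpolate_virtual_samples(samples, n_virtual, seed=0):
    """Create virtual samples on line segments between random pairs of samples.

    Each sample is a tuple of numbers (assumed here: features, then the label).
    """
    rng = random.Random(seed)
    virtual = []
    for _ in range(n_virtual):
        a, b = rng.sample(samples, 2)   # two distinct original samples
        t = rng.random()                # interpolation weight in [0, 1)
        virtual.append(tuple((1 - t) * u + t * v for u, v in zip(a, b)))
    return virtual


# Hypothetical small data set: (feature 1, feature 2, label).
originals = [(1.0, 2.0, 5.0), (2.0, 1.0, 6.0), (4.0, 3.0, 9.0)]
virtual = interpolate_virtual_samples(originals, n_virtual=5)
enlarged = originals + virtual  # expanded training set
```

Because every virtual sample lies between two real ones, this kind of scheme cannot produce samples outside the region already covered by the data, which is one motivation for the distribution-learning generators discussed next.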
Recently, deep learning (DL) has been introduced to solve small sample modeling problems due to its strong representation ability. The main idea of DL based data enhancement methods is to fill in the vacancies of the small-size original data using generated virtual samples. The popular DL models for this purpose include the variational autoencoder (VAE) (Boquet et al., 2020), the generative adversarial network (GAN) (Han et al., 2018) and their variants (Creswell et al., 2018). For VAE based sample generation, most work focuses on learning the latent distribution of the data. When the data distribution is complex, however, VAE is not suitable for large-scale generation and the model complexity increases accordingly. Another possible issue of VAE is that the samples generated by decoding randomly sampled latent variables do not completely follow the distribution of the original samples. When using GAN for sample generation, the probability distribution of each point in the sample space can be indirectly estimated through the game between generation and discrimination (Tian et al., 2019). In practice, however, the noise inputs sampled for the generator are random, so network training easily falls into mode collapse. To tackle this issue, Dumoulin et al. (2016) introduced an adversarially learned inference (ALI) model that combines the respective advantages of a generation network and an inference network. This model performs well on semi-supervised SVHN and CIFAR10 tasks. Even so, exploding and vanishing gradients are still common when training GAN. To this end, Du et al. (2021) proposed an improved model based on WGAN-gp to alleviate the training problem and further improve the quality of generated images.
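For reference, the critic objective usually associated with WGAN-gp can be written as follows. This is the standard form from the general WGAN-gp literature, not a formula quoted from this paper; the symbols $D$ (critic), $P_r$ (real data distribution), $P_g$ (generator distribution) and $\lambda$ (penalty weight) are notation assumed here.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
    - \mathbb{E}_{x \sim P_r}\left[ D(x) \right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \left[ \left( \left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term penalizes critic gradients whose norm deviates from 1 at points interpolated between real and generated samples, which is how the method enforces the Lipschitz constraint without weight clipping and helps stabilize training.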
In addition to the above work focusing on image processing tasks, more and more novel and efficient generative models have begun to play an important role in soft sensor applications. Wang and Liu (2020) combined VAE and WGAN to construct a generative model, VA-WGAN, and then used the decoder of the VAE to generate samples. VA-WGAN not only ensures the reliability of generated samples, but also promotes the smooth training of WGAN. To improve the ability of VAE to extract deep features, Gao et al. (2021) proposed a novel data enhancement method based on SVAE-WGAN, which achieves sample generation by stacking multiple VAEs. However, some challenges remain in obtaining good-quality generated samples. First, generative models usually generate only the inputs of samples, while the outputs depend entirely on the predictions of models built on labeled samples. The correlation between inputs and outputs generated in this way is not strong enough to guarantee the accuracy of the prediction model, especially when the training sample size is small. Secondly, DL based data enhancement cannot guarantee that all generated samples come from the original sample distribution. Samples deviating from the original distribution will deteriorate the prediction performance of soft sensor models. Thirdly, the diversity of generated samples is also a key factor in constructing high-performance soft sensor models. Even if the training samples for the generative model are the same, the samples it produces usually show a certain degree of randomness, which does not help ensure the stability of model predictions.
To address the problems mentioned above, this paper proposes a novel generative model, SV-WGANgp, and develops a soft sensor modeling method, DESE, based on data enhancement and selective ensemble. SV-WGANgp combines SVAE and WGAN-gp to generate labeled samples. DESE is then built by integrating a group of enhanced GPR base models, which are trained on the extended training samples and selected through multi-objective optimization. The effectiveness of the proposed SV-WGANgp and DESE methods is verified on a numerical simulation case and an industrial case. Experimental results show that, by employing SV-WGANgp for data enhancement and multi-objective optimization for ensemble pruning, the DESE method can significantly improve the prediction accuracy of soft sensor models on small sample problems. The main contributions of this work are twofold:
- (1) A supervised sample generation method SV-WGANgp is proposed. Different from traditional generative models, which generate inputs and estimate outputs, SV-WGANgp can directly generate complete samples, i.e., generating inputs and outputs simultaneously. This greatly simplifies the sample generation process. In addition, compared with the methods of using supervised VAE or WGAN-gp alone, SV-WGANgp combines the powerful learning capability of SVAE on data distribution with the adversarial learning idea of WGAN-gp, thus achieving more stable generation of high-quality samples.
- (2) A selective ensemble modeling framework is formulated by introducing data enhancement. The framework first constructs diverse enhanced base models through the sample perturbation mechanism based on training samples enlarged by the generated samples. Then, those enhanced base models that satisfy accuracy and diversity are selected through a multi-objective optimization ensemble pruning strategy. Finally, the selected base models are fused to improve the prediction accuracy and reliability. This framework can effectively overcome the challenge of scarce labeled data, negative impact of low-quality generated samples, as well as the instability of prediction model caused by randomness of generated samples.
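The three steps of the selective ensemble framework (perturb, prune, fuse) can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's implementation: a one-dimensional least-squares line stands in for the GPR base models, bootstrap resampling stands in for the sample perturbation mechanism, a brute-force Pareto filter over individual models stands in for the multi-objective optimization pruning, and simple averaging stands in for the fusion rule. All names and data are hypothetical.

```python
import random


def fit_line(points):
    """Closed-form 1-D least squares; a stand-in for a GPR base model."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx if sxx > 0 else 0.0
    return slope, my - slope * mx


def rmse(model, points):
    a, b = model
    return (sum((a * x + b - y) ** 2 for x, y in points) / len(points)) ** 0.5


def diversity(i, preds):
    """Mean absolute disagreement between model i and all other models."""
    others = [p for j, p in enumerate(preds) if j != i]
    total = sum(abs(u - v) for p in others for u, v in zip(preds[i], p))
    return total / (len(others) * len(preds[i]))


def pareto_front(scores):
    """Indices not dominated on (error: minimize, diversity: maximize)."""
    front = []
    for i, (e_i, d_i) in enumerate(scores):
        dominated = any(
            e_j <= e_i and d_j >= d_i and (e_j < e_i or d_j > d_i)
            for j, (e_j, d_j) in enumerate(scores) if j != i)
        if not dominated:
            front.append(i)
    return front


def selective_ensemble(train, valid, n_models=10, seed=0):
    """Train on bootstrap resamples, keep Pareto-optimal models, average them.

    Assumes n_models >= 2 so that diversity is defined.
    """
    rng = random.Random(seed)
    models = [fit_line([rng.choice(train) for _ in train]) for _ in range(n_models)]
    preds = [[a * x + b for x, _ in valid] for a, b in models]
    scores = [(rmse(m, valid), diversity(i, preds)) for i, m in enumerate(models)]
    selected = [models[i] for i in pareto_front(scores)]
    return lambda x: sum(a * x + b for a, b in selected) / len(selected)


# Hypothetical enlarged training set (original plus generated samples).
train = [(float(x), 2.0 * x + 1.0 + (0.1 if x % 2 else -0.1)) for x in range(8)]
valid = [(x + 0.5, 2.0 * (x + 0.5) + 1.0) for x in range(8)]
predictor = selective_ensemble(train, valid)
```

Keeping only non-dominated models means a base model survives either because it is accurate or because it disagrees usefully with the others, which is the trade-off the pruning step is meant to balance.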
The rest of this paper proceeds as follows. Section 2 provides a detailed description of the SVAE model, the WGAN-gp model and the proposed SV-WGANgp model. Section 3 presents the soft sensor method DESE based on data enhancement and selective ensemble. In Section 4, two case studies are reported to verify the effectiveness of SV-WGANgp and the DESE method. Finally, the conclusions are drawn in Section 5.
Section snippets
Proposed SV-WGANgp
In this section, SVAE and WGAN-gp are first briefly explained. Then, a detailed introduction of the proposed deep generative network SV-WGANgp is presented.
DESE method for soft sensor modeling
Though SV-WGANgp can be applied for generating samples in order to solve the problem of data scarcity, we still encounter two difficulties in soft sensor modeling. One is that the generated samples have uncertainties and it is not easy to select samples one by one. The other one is that it is difficult to obtain a high-performance model built with generated samples. To this end, we propose a soft sensor modeling method based on data enhancement and selective ensemble (DESE). On the one hand,
Case studies
To verify the effectiveness of the proposed sample generative model SV-WGANgp and soft sensor method DESE, an industrial chlortetracycline fermentation process and a simulated fed penicillin fermentation process are applied in this section. The methods used for comparison are as follows:
- (1) GPR (Rasmussen and Williams): Gaussian process regression, where a Matérn covariance function with a noise term is used.
- (2) SVAE (Xie et al., 2019): supervised variational autoencoder based sample generation.
- (3) WGAN-gp (
Conclusions
The main contributions of this paper are two-fold. On the one hand, a deep generative model SV-WGANgp is proposed. This model combines the advantages of SVAE and WGAN-gp to realize simultaneous generation of sample input and output. On the other hand, a novel soft sensor method DESE is developed for small sample environment by combining SV-WGANgp with selective ensemble learning. DESE first generates diverse extended labeled samples by SV-WGANgp, and further constructs a set of enhanced base
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported in part by National Natural Science Foundation of China (NSFC) (62163019, 61863020), and the Applied Basic Research Project of Yunnan Province (202101AT070096).
References (55)
- et al. A variational autoencoder solution for road traffic forecasting systems: Missing data imputation, dimension reduction, model selection and anomaly detection. Transport. Res. C-Emer. (2020)
- et al. A PSO based virtual sample generation method for small sample sets: Applications to regression datasets. Eng. Appl. Artif. Intell. (2017)
- et al. Prediction of water quality based on SVR by fluorescence excitation-emission matrix and UV–Vis absorption spectrum. Spectrochim. Acta, Part A (2022)
- et al. Data-derived soft-sensors for biological wastewater treatment plants: An overview. Environmental Modelling & Software (2013)
- et al. Enhanced virtual sample generation based on manifold features: Applications to developing soft sensor using small data. ISA Trans. (2022)
- et al. Ensemble just-in-time learning framework through evolutionary multi-objective optimization for soft sensor development of nonlinear industrial processes. Chemom. Intell. Lab. Syst. (2019)
- et al. Improving learning accuracy by using synthetic samples for small datasets with non-linear attribute dependency. Decis. Support Syst. (2014)
- et al. ANN-Based vibration control of an aerial refueling hose system with input nonlinearity and prescribed output constraint. J. Franklin Inst. (2022)
- et al. Ensemble deep kernel learning with application to quality prediction in industrial polymerization processes. Chemom. Intell. Lab. Syst. (2018)
- et al. The kernel-based nonlinear multivariate grey model. Appl. Math. Modell. (2018)
- Joint sparse principal component regression with robust property. Expert Syst. Appl.
- Latent class analysis in PLS-SEM: A review and recommendations for future applications. J. Bus. Res.
- Evolutionary multi-objective generation of recurrent neural network ensembles for time series prediction. Neurocomputing
- Data driven parallel prediction of building energy consumption using generative adversarial nets. Energy Build.
- Data supplement for a soft sensor using a new generative model based on a variational autoencoder and Wasserstein GAN. J. Process Control
- Convergence of the RMSProp deep learning method with penalty for nonconvex optimization. Neural Networks
- Online prediction for contamination of chlortetracycline fermentation based on Dezert-Smarandache theory. Chin. J. Chem. Eng.
- A novel constrained dense convolutional autoencoder and DNN-based semi-supervised method for shield machine tunnel geological formation recognition. Mech. Syst. Sig. Process.
- A novel semi-supervised pre-training strategy for deep networks and its application for quality variable prediction in industrial processes. Chem. Eng. Sci.
- Bayesian network for dynamic variable structure learning and transfer modeling of probabilistic soft sensor. J. Process Control
- A method for capacity prediction of lithium-ion batteries under small sample conditions. Energy
- Novel manifold learning based virtual sample generation for optimizing soft sensor with small data. ISA Trans.
- Wasserstein generative adversarial networks, in: International conference on machine learning. PMLR
- Artificial Neural Network for Predicting Silicon Content in the Hot Metal Produced in a Blast Furnace Fueled by Metallurgical Coke. Mater. Res.
- A deep probabilistic transfer learning framework for soft sensor modeling with missing data. IEEE Trans. Neural Networks Learn. Syst.
- Generative adversarial networks: An overview. IEEE Signal Process Mag.