Forecasting industrial aging processes with machine learning methods

https://doi.org/10.1016/j.compchemeng.2020.107123

Highlights

  • Machine learning models can be used to accurately forecast industrial aging processes.

  • Multiple machine learning models were tested using synthetic and real-world data.

  • Good performance of recurrent networks shows that having temporal context is crucial.

  • Recurrent networks maintain good performance even when dealing with smaller datasets.

Abstract

Accurately predicting industrial aging processes makes it possible to schedule maintenance events further in advance, ensuring cost-efficient and reliable operation of the plant. So far, these degradation processes have usually been described by mechanistic or simple empirical prediction models. In this paper, we evaluate a wider range of data-driven models, comparing some traditional stateless models (linear and kernel ridge regression, feed-forward neural networks) to more complex recurrent neural networks (echo state networks and LSTMs). We first examine how much historical data is needed to train each of the models on a synthetic dataset with known dynamics. Next, the models are tested on real-world data from a large-scale chemical plant. Our results show that recurrent models produce near-perfect predictions when trained on larger datasets and maintain good performance even when trained on smaller datasets with domain shifts, while the simpler models performed comparably only on the smaller datasets.

Introduction

Aging of critical assets is an omnipresent phenomenon in any production environment, causing significant maintenance expenditures or leading to production losses. The understanding and anticipation of the underlying degradation processes is therefore of great importance for a reliable and economic plant operation, both in discrete manufacturing and in the process industry.

With a focus on the chemical industry, notorious aging phenomena include the deactivation of heterogeneous catalysts (Forzatti and Lietti, 1999) due to coking (Barbier, 1986), sintering (Harris, 1995), or poisoning (Nielsen, 1995); plugging of process equipment, such as heat exchangers or pipes, on the process side due to coke layer formation (Cai et al., 2002) or polymerization (Wu et al., 2018); fouling of heat exchangers on the water side due to microbial or crystalline deposits (Müller-Steinhagen, 2000); erosion of installed equipment, such as injection nozzles or pipes, in fluidized bed reactors (Wang, 1996; Werther, 2000); corrosion inside pipes or vessels (Nei, 2007); and more.

Despite the large variety of affected asset types in these examples, and the completely different physical or chemical degradation processes that underlie them, all of these phenomena share some essential characteristics:

  1. The considered critical asset has one or more key performance indicators (KPIs), which quantify the progress of degradation.

  2. On a time scale much longer than the typical production time scales (i.e., batch time for discontinuous processes; typical time between set point changes for continuous processes), the KPIs drift more or less monotonically to ever higher or lower values, indicating the occurrence of an irreversible degradation phenomenon. (On shorter time scales, the KPIs may exhibit fluctuations that are not driven by the degradation process itself, but rather by varying process conditions or background variables such as the ambient temperature.)

  3. The KPIs return towards their baseline after maintenance events, such as cleaning of a fouled heat exchanger, replacement of an inactive catalyst, etc.

  4. The degradation is no ‘bolt from the blue’, such as the bursting of a flawed pipe, but is rather driven by creeping, inevitable wear and tear of process equipment.

Any aging phenomenon with these general properties is addressed by the present work.
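
As a rough illustration of properties 1 to 4, the following Python sketch generates a toy KPI that drifts irreversibly within each degradation cycle, is driven by a process condition, carries small short-term fluctuations, and returns to its baseline after each maintenance event. The function name `simulate_kpi`, the single process condition `load`, the linear drift law, and all parameter values are assumptions made purely for illustration; this is not the mechanistic model or the data used in this work.

```python
import random
from typing import List, Tuple


def simulate_kpi(
    cycle_lengths: List[int],
    baseline: float = 1.0,
    drift_rate: float = 0.01,
    noise_std: float = 0.002,
    seed: int = 0,
) -> List[Tuple[List[float], List[float]]]:
    """Return one (process_condition, kpi) pair of time series per degradation cycle."""
    rng = random.Random(seed)
    cycles = []
    for length in cycle_lengths:
        # Property 3: a maintenance event resets the accumulated damage.
        damage = 0.0
        load_series: List[float] = []
        kpi_series: List[float] = []
        for _ in range(length):
            # Hypothetical process condition x(t), always positive.
            load = rng.uniform(0.5, 1.5)
            # Properties 2 and 4: damage only grows, at a rate set by x(t).
            damage += drift_rate * load
            # Short-term fluctuation that is unrelated to the degradation.
            fluctuation = rng.gauss(0.0, noise_std)
            load_series.append(load)
            # Property 1: the KPI quantifies the progress of degradation.
            kpi_series.append(baseline + damage + fluctuation)
        cycles.append((load_series, kpi_series))
    return cycles


cycles = simulate_kpi([50, 80, 30])
for index, (_, kpi) in enumerate(cycles):
    print(f"cycle {index}: start={kpi[0]:.3f}, end={kpi[-1]:.3f}")
```

In this toy setting, each cycle starts close to the baseline and ends at a clearly higher KPI value, with the end level depending on the cycle length and on the process conditions seen during the cycle.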

Property (4) suggests that the evolution of a degradation KPI is to a large extent determined by the process conditions, and not by uncontrolled, external factors. This sets the central goal of the present work: To forecast the evolution of the degradation KPI over a certain time horizon, given the planned process conditions in this time frame. If instead external factors such as humidity, ambient temperature, human interventions, etc. are the dominating driving forces of an aging process, the presented approach is not applicable.

For virtually any important aging phenomenon in chemical engineering, the respective scientific community has developed a detailed understanding of its microscopic and macroscopic driving forces. This understanding has commonly been condensed into sophisticated mathematical models. Examples of such mechanistic degradation models deal with coking of steamcracker furnaces (Berreni and Wang, 2011; De Schepper et al., 2010; Gao et al., 2009), sintering (Li et al., 2017; Ruckenstein and Pulvermacher, 1973) or coking (Froment, 2001) of heterogeneous catalysts, or crystallization fouling of heat exchangers (Brahim et al., 2003).

While these models give valuable insights into the dynamics of experimentally inaccessible quantities, and can help to verify or falsify hypotheses about the degradation mechanism in general, they are usually not (or only with significant modeling effort) transferable to the specific environment in a real-world apparatus. Broadly speaking, they often describe ‘clean’ observations of the degradation process in a lab environment, and do not reflect the ‘dirty’ reality in production, where additional effects come into play that are hard or impossible to model mechanistically. To mention only one example, sintering dynamics of supported metal catalysts are hard to model quantitatively even in the ‘clean’ system of Wulff-shaped particles on a flat surface (Li et al., 2017), whereas in real heterogeneous catalysts, surface morphology and particle shape may deviate strongly from this assumption. Another disadvantage of mechanistic models is that their numerical solution can be computationally expensive or even intractable.

While the latter issue of computational complexity can be mitigated by surrogate modeling methods (Wang and Ierapetritou, 2018; Kim and Boukouvala, 2020), the former issue of real-world complexity is the main reason why mechanistic models of degradation dynamics are rarely used in a production environment.

This weakness is addressed by hybrid process models (also referred to as gray-box models), which combine mechanistic with data-driven modeling approaches to bridge the gap between idealized mechanistic models and the real world (Von Stosch et al., 2014; Willis and von Stosch, 2017; Asprion et al., 2019; Zendehboudi et al., 2018; Glassey and von Stosch, 2018). However, to the best of our knowledge, these models have not yet been used to forecast industrial aging processes in chemical plants.

When dealing with the complexity of modeling real-world aging processes, statistical approaches have proven successful in a variety of applications. For example, data-driven methods for fault detection of chemical plants (Russell et al., 2012), such as multivariate anomaly detection with Fisher discriminant (Chiang et al., 2000) or principal component analysis (Kresta et al., 1991), are routinely applied nowadays to monitor process equipment. However, we emphasize that most of these applications focus on the detection or monitoring of degradation, not on the prediction of its progression.

Publications of data-driven models that predict degradation dynamics predominantly address specific aging phenomena, such as batch-to-batch fouling of heat exchangers as a function of the polymer type produced in the respective batch (Wu et al., 2018), or fouling of the crude preheat train in petroleum refineries (Radhakrishnan et al., 2007; Aminian and Shahhosseini, 2008). To the best of our knowledge, all published work in this field is either based on classical statistical regression methods, such as ordinary or partial least squares (Wu et al., 2018), or on small-scale machine learning (ML) methods, such as small feed-forward neural networks (FFNN) trained with limited datasets (Radhakrishnan et al., 2007; Aminian and Shahhosseini, 2008). So far, advanced ML algorithms, such as recurrent neural networks (RNN), trained with years or even decades of historical plant data, have not been studied in depth in the context of predicting degradation of chemical process equipment (Lee et al., 2018). It is the aim of this work to investigate the prospects of advanced ML methods for this problem, compare them to classical regression methods and understand potential limitations.

The rest of this paper is structured as follows: First, we formalize the general industrial aging process (IAP) problem setting (Section 2). Then we describe our two datasets (Section 3) and briefly introduce the five ML models that we evaluated for this task (Section 4). Finally, we present the prediction results of the different models on both datasets (Section 5) and conclude the paper with a discussion (Section 6).

Section snippets

Problem definition

The general industrial aging process (IAP) forecasting problem is illustrated in Fig. 1: The aim is to model the evolution of one or several degradation KPIs y(t) as a function of the planned process conditions x(t) over the time frame of an entire degradation cycle, i.e., until the next required maintenance event. Formally, the problem statement reads:

Given A time window t ∈ [0, Ti] between two maintenance events, referred to as the ith degradation cycle, and the planned process conditions xi(t

Datasets

To gain insights into and evaluate different ML models for the IAP forecasting problem, we consider two datasets: one synthetic, which we generated ourselves using a mechanistic model, and one containing real-world data from a large plant at BASF. Both datasets are described in more detail below.

The reason for working with synthetic data is that this allows us to control two important aspects of the problem: data quantity and data quality. Data quantity is measured, e.g., by the number of

Machine learning methods

We now frame the IAP forecasting problem described in Section 2 as a machine learning problem, by defining a concrete function $f$ that returns $\hat{y}_i(t)$, an estimate of the KPIs at a time point $t$ in the $i$th degradation cycle, based on the process conditions $x_i$ at this time point as well as possibly up to $k$ hours before $t$:

$$\hat{y}_i(t) = f\big(x_i(t)[, x_i(t-1), \dots, x_i(t-k)]\big) \quad \forall\, t \in [0, \dots, T_i].$$

The task is to predict $y_i(t)$ for the complete cycle (i.e., up to $T_i$), typically starting from about 24 h after the last maintenance
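
The formulation above, in which the estimate at time t depends on the current process conditions and optionally on those of the previous k time steps, can be sketched in Python as follows. This is a minimal illustration only: the helper names `lagged_inputs` and `predict_cycle` are hypothetical, predictions are assumed to start at t = k so that every window is complete, and `toy_f` is a stand-in for a trained stateless model rather than any of the models evaluated in this work.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def lagged_inputs(x_cycle: Sequence[Vector], k: int) -> List[Vector]:
    """For every t >= k, concatenate [x(t), x(t-1), ..., x(t-k)] into one input vector."""
    if k < 0:
        raise ValueError("k must be non-negative")
    windows: List[Vector] = []
    for t in range(k, len(x_cycle)):
        window: Vector = []
        for lag in range(k + 1):
            window.extend(x_cycle[t - lag])
        windows.append(window)
    return windows


def predict_cycle(
    f: Callable[[Vector], float], x_cycle: Sequence[Vector], k: int
) -> List[float]:
    """Apply a stateless model f independently to each lagged input window."""
    return [f(window) for window in lagged_inputs(x_cycle, k)]


# Toy cycle with two process conditions per time step (illustrative values).
x_cycle = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
windows = lagged_inputs(x_cycle, k=1)


def toy_f(window: Vector) -> float:
    # Placeholder for a fitted model: here simply the sum of all inputs.
    return sum(window)


y_hat = predict_cycle(toy_f, x_cycle, k=1)
print(windows)
print(y_hat)
```

With k = 0 the model sees only the current process conditions. A recurrent model would instead carry an internal state from one time step to the next and would not need explicit windows of this kind.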

Results

In this section, we report our evaluation of the five different ML models introduced in Section 4 using the synthetic and real-world datasets described in Section 3. To measure the prediction errors of the ML models, we use the mean squared error (MSE), which, due to the subdivision of our datasets into cycles, we define slightly differently than usual: Let the dataset D be composed of N cycles, and let $y_i(t)$ denote the KPIs at time point $t \in \{0, \dots, T_i\}$ within the $i$th cycle, where $T_i$ is the length
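
The exact definition is cut off in this excerpt, so the following Python sketch shows only one plausible cycle-aware variant, stated here as an assumption: the squared error is first averaged within each cycle and then averaged over the N cycles, so that long cycles do not dominate the score. The function names `cycle_averaged_mse` and `pooled_mse` are hypothetical, and a single scalar KPI per time point is assumed for brevity.

```python
from typing import List, Sequence


def cycle_averaged_mse(
    y_true: Sequence[Sequence[float]], y_pred: Sequence[Sequence[float]]
) -> float:
    """Average the per-cycle MSE over cycles (assumed variant, each cycle weighted equally)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need the same, non-zero number of cycles")
    per_cycle: List[float] = []
    for true_cycle, pred_cycle in zip(y_true, y_pred):
        if len(true_cycle) != len(pred_cycle) or not true_cycle:
            raise ValueError("cycles must be non-empty and of equal length")
        squared = [(a - b) ** 2 for a, b in zip(true_cycle, pred_cycle)]
        per_cycle.append(sum(squared) / len(squared))
    return sum(per_cycle) / len(per_cycle)


def pooled_mse(
    y_true: Sequence[Sequence[float]], y_pred: Sequence[Sequence[float]]
) -> float:
    """Usual MSE over all time points pooled together, shown for comparison."""
    squared = [
        (a - b) ** 2
        for true_cycle, pred_cycle in zip(y_true, y_pred)
        for a, b in zip(true_cycle, pred_cycle)
    ]
    return sum(squared) / len(squared)


# A short cycle that is predicted badly and a longer one predicted perfectly.
y_true = [[0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
y_pred = [[2.0, 2.0], [0.0, 0.0, 0.0, 0.0]]
print(cycle_averaged_mse(y_true, y_pred), pooled_mse(y_true, y_pred))
```

In this toy example the two measures differ because the poorly predicted cycle is short: pooling all time points gives it less weight than averaging per cycle does.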

Discussion

Formulating accurate mathematical models of industrial aging processes (IAP) is essential for predicting when critical assets need to be replaced or restored. In world-scale chemical plants such predictions can be of great economic value, as they increase plant reliability and efficiency. While mechanistic models are useful for elucidating the influencing factors of degradation processes under laboratory conditions, it is notoriously difficult to adapt them to the specific circumstances of

CRediT authorship contribution statement

Mihail Bogojeski: Conceptualization, Methodology, Software, Formal analysis, Investigation, Visualization, Writing - original draft. Simeon Sauer: Conceptualization, Methodology, Data curation, Formal analysis, Supervision, Funding acquisition, Writing - original draft. Franziska Horn: Conceptualization, Methodology, Software, Formal analysis, Visualization, Investigation, Writing - original draft. Klaus-Robert Müller: Conceptualization, Methodology, Supervision, Funding acquisition, Project

Declaration of Competing Interest

No.

Acknowledgments

KRM was supported in part by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea Government (No. 2019-0-00079, Artificial Intelligence Graduate School Program, Korea University), and was partly supported by the German Ministry for Education and Research (BMBF) under Grants 01IS14013A-E, 01GQ1115, 01GQ0850, 01IS18025A, 031L0207D and 01IS18037A; the German Research Foundation (DFG) under Grant Math+, EXC 2046/1, Project ID 390685689.

References (82)

  • G.-Y. Gao et al. Mathematical modeling and optimal operation of industrial tubular reactor for naphtha cracking. Computer Aided Chemical Engineering (2009)

  • R. Hecht-Nielsen. Theory of the backpropagation neural network. Neural Networks for Perception (1992)

  • K. Hornik et al. Multilayer feedforward networks are universal approximators. Neural Netw. (1989)

  • S.H. Kim et al. Surrogate-based optimization for mixed-integer nonlinear problems. Comput. Chem. Eng. (2020)

  • S. Kolassa. Evaluating predictive count data distributions in retail sales forecasting. Int. J. Forecast. (2016)

  • J.H. Lee et al. Machine learning: overview of the recent progresses and implications for the process systems engineering field. Comput. Chem. Eng. (2018)

  • G. Montavon et al. Explaining nonlinear classification decisions with deep Taylor decomposition. Pattern Recognit. (2017)

  • G. Montavon et al. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. (2018)

  • D. Nemec et al. Flow through packed bed reactors: 1. Single-phase flow. Chem. Eng. Sci. (2005)

  • V. Radhakrishnan et al. Heat exchanger fouling model and preventive maintenance scheduling tool. Appl. Therm. Eng. (2007)

  • E. Ruckenstein et al. Growth kinetics and the size distributions of supported metal crystallites. J. Catal. (1973)

  • J. Schmidhuber. Deep learning in neural networks: an overview. Neural Netw. (2015)

  • H.T. Siegelmann et al. On the computational power of neural nets. J. Comput. Syst. Sci. (1995)

  • M. Von Stosch et al. Hybrid semi-parametric modeling in process systems engineering: past, present and future. Comput. Chem. Eng. (2014)

  • B. Wang. Erosion-corrosion of thermal sprayed coatings in FBC boilers. Wear (1996)

  • Z. Wang et al. Constrained optimization of black-box stochastic systems using a novel feasibility enhanced kriging-based method. Comput. Chem. Eng. (2018)

  • M.J. Willis et al. Simultaneous parameter identification and discrimination of the nonparametric structure of hybrid semi-parametric models. Comput. Chem. Eng. (2017)

  • O. Wu et al. Data-driven degradation model for batch processes: a case study on heat exchanger fouling. Computer Aided Chemical Engineering (2018)

  • S. Zendehboudi et al. Applications of hybrid models in chemical, petroleum, and energy systems: a systematic review. Appl. Energy (2018)

  • L. Arras et al. Explaining recurrent neural network predictions in sentiment analysis. Proceedings of the 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (2017)

  • N. Asprion et al. Gray-box modeling for the optimization of chemical processes. Chem. Ing. Tech. (2019)

  • W. Aswolinskiy et al. Unsupervised transfer learning for time series via self-predictive modelling-first results. Proceedings of the Workshop on New Challenges in Neural Computation (NC2) (2017)

  • S. Bach et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE (2015)

  • Y. Bengio et al. Advances in optimizing recurrent networks. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (2013)

  • Bianchi, F. M., Maiorino, E., Kampffmeyer, M. C., Rizzi, A., Jenssen, R., 2017. An overview and comparative analysis of...

  • C.M. Bishop. Pattern Recognition and Machine Learning (2006)

  • C.M. Bishop. Neural Networks for Pattern Recognition (1995)

  • D.M. Blei et al. Variational inference: a review for statisticians. J. Am. Stat. Assoc. (2017)

  • A. Choromanska et al. The loss surfaces of multilayer networks. Artificial Intelligence and Statistics (2015)

  • S.C. De Schepper et al. Modeling the coke formation in the convection section tubes of a steam cracker. Ind. Eng. Chem. Res. (2010)

  • J. Donahue et al. Long-term recurrent convolutional networks for visual recognition and description. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2015)