Introduction

Improving process monitoring is a common need across the process industries, including the chemical, food, biochemical and pharmaceutical sectors (Boyd and Varley 2001). To further advance the field, innovation is needed not only in the theoretical foundations but also in identifying new technical solutions (Hu et al. 2015). Process measurements should ideally be non-invasive and suitable for in-line application with real-time response, so that control measures can be applied without delay. Amongst several potential sensing methods, acoustic emission (AE) is a low-cost, data-rich technique applicable to in-line monitoring. Traditionally, acoustic techniques are categorised as either active or passive. The former consists of a transmitter generating an acoustic wave within the system and a receiver acquiring the response of the stimulated system. The latter, also known as AE, consists only of a sensor recording the acoustic waves generated by the process itself.

The signal is often a combination of numerous acoustic events, all propagating to the sensor via different paths. Most AE research has focused on fault detection (leakage, failure, etc.) (Boyd and Varley 2001), corrosion (Cole and Watson 2005), grinding (Griffin and Chen 2016) and tool wear (Elforjani and Shanbr 2018; Li et al. 2015). More recent studies have also investigated AE as a means of monitoring physico-chemical changes within processes involving powders and fluids. For example, Aldrich and Theron (2000) used directional microphones to estimate particle size in a ball mill using continuous regression. Other studies have applied AE to pneumatic conveying of powders (Esbensen et al. 1998) and V-blenders (Crouter and Briens 2015). Applications in multiphase mixing are also reported in the literature: Nordon et al. (2004), for example, applied AE to a jacketed stirred tank to monitor a heterogeneous reaction. While the first known reference on gas–liquid flow dates from the 1920s (Bragg Sir 1921), more recent works (Addali et al. 2010) have used AE energy information to predict the gas phase fraction in a two-phase (air–water) slug flow.

Although many researchers have investigated methods to determine multiphase mixing regimes in stirred tanks in real time, as reported in subsequent sections, their application has been limited to R&D. This is mainly due to difficulties in retrofitting the devices to existing plants and to slow response times. In this work, AE is applied to monitor gas–liquid mixing in a 3 L stirred tank, with the objective of identifying the operating bubble dispersion regime. The aim is to propose a methodology that provides real-time information from an acoustic piezoelectric sensor installed on the outside of a stirred tank. The data-richness and high time resolution of the technique are desirable features for modern control strategies.

Gas–liquid mixing

Gas–liquid reactors are very common unit operations in the biochemical and chemical industries, where ensuring appropriate interphase contact and gas dispersion is critical. Different regimes are observed when sparging gas into stirred tanks, namely flooded, loaded, completely dispersed and gas recirculation (Nienow et al. 1985), depending on gas flow rate, tank design and fluid properties (e.g. rheology, density, interfacial tension). The different regimes (shown in Fig. 1) characterise how well the gas is dispersed within the volume of the tank: contact between the two phases increases moving from the flooded (a) towards the completely dispersed condition (c). In the flooded regime, the impeller speed N does not exceed the critical flooding speed Nf; gas dispersion is therefore minimal, and a plume of gas rises from the sparger to the surface along the impeller. When N is higher than Nf but lower than Ncd (the critical speed for complete dispersion), the system is in the loaded regime: the gas is more dispersed but does not reach the whole volume of the tank, because buoyancy forces overcome the radial drag force. At higher impeller speeds (Ncd < N < Nr), in the completely dispersed regime, the gas is well dispersed throughout the vessel at low power, but it is only beyond Nr (the critical recirculation speed) that the full gas recirculation regime is observed. The latter two are generally the desirable conditions for unit operations requiring good contact between the two phases.

Fig. 1 Pictorial representation of the main gas–liquid regimes: a flooded, b loaded and c completely dispersed (Nienow et al. 1985)

Commonly, the operating regime is estimated from a characteristic regime chart (Middleton 1992) defined in terms of two dimensionless numbers: the Froude number \( \left( {Fr = N^{2} D/g} \right) \), the ratio of flow inertia to gravitational forces, and the gas flow number (Warmoeskerken and Smith 1985) \( \left( {Fl = Q_{g} /\left( {ND^{3} } \right)} \right) \), the ratio of the gas inlet rate to the impeller pumping rate. Here N is the impeller speed, D the impeller diameter, g the gravitational acceleration and Qg the volumetric gas flow rate.
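As an illustration of how an operating point is located on such a chart, the two dimensionless numbers can be evaluated as in the minimal sketch below. The function names are hypothetical, and the numerical values (impeller diameter, impeller speed and gas flow rate) are taken from the operating ranges reported later in this paper purely as an example; the regime boundaries themselves are not reproduced here.

```python
G = 9.81  # gravitational acceleration, m s^-2


def froude_number(n_rps: float, d_m: float) -> float:
    """Fr = N^2 D / g, with N in rev/s and D in m."""
    return n_rps ** 2 * d_m / G


def gas_flow_number(q_g_m3s: float, n_rps: float, d_m: float) -> float:
    """Fl = Q_g / (N D^3), with Q_g in m^3/s, N in rev/s and D in m."""
    return q_g_m3s / (n_rps * d_m ** 3)


# Example operating point (illustrative choice within the reported ranges).
impeller_diameter = 0.056      # m
speed = 600 / 60               # 600 rpm converted to rev/s
gas_rate = 5 / 1000 / 60       # 5 L/min converted to m^3/s

fr = froude_number(speed, impeller_diameter)
fl = gas_flow_number(gas_rate, speed, impeller_diameter)
print(f"Fr = {fr:.3f}, Fl = {fl:.4f}")
```

The unit conversions are the part most easily overlooked: N must be expressed in revolutions per second and Qg in cubic metres per second for both numbers to be dimensionless.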

Sensing and measurement techniques

Many measurement techniques exist for identifying the different regimes, including optical methods (Yawalkar et al. 2002), level probes (Gao et al. 2001), ultrasonics (Cents et al. 2005) and tomography [electrical resistance tomography, ERT (Forte et al. 2019; Jamshed et al. 2018), X-ray (Ford et al. 2008), γ-ray (Veera et al. 2001)], but they are often limited by practical considerations (invasiveness, retrofitting, opacity, safety) and cost. These systems all require high investment in equipment and operation, and they are usually fixed installations that need substantial retrofitting. Furthermore, for high-energy methods (X-ray, γ-ray), the safety measures needed for safe use add further cost. The AE device, in contrast, consists of simple hardware that does not require retrofitting and is portable: because the sensor is installed on the outside wall, it can be placed on one tank and quickly moved to another. This, together with the low investment and operating costs, makes it convenient to use and free of safety issues. The AE system applied in this study therefore has the potential to be a cheap, easy to install (and retrofit) alternative technique that can offer reliable information on the operating gas–liquid regime.

In this work, AE data are acquired in the loaded and completely dispersed regimes as well as in the ungassed condition, which simulates an accidental shutdown of the air feed. Supervised machine learning (ML) algorithms are used to identify and recognise the different conditions. To evaluate the robustness of the technique in correctly identifying the operating regime, a deviation from the biphasic condition is also introduced by adding solid particles to the bulk, up to a concentration of 5% w/w. This situation is not unusual in chemical reactions, for example when solids precipitate on mixing reactive gas–liquid mixtures in stirred tanks (Zhao et al. 2016, 2017). The presence of solids increases the emitted signal because of impacts of the particles with the wall, the impeller and each other, which challenges the system to correctly predict the gas–liquid operating regime. The regime is considered unchanged at fixed air volumetric flow and impeller speed: for low concentrations of solids [< 20% w/w (Chapman et al. 1983b)], the literature reports little effect of the particles on the gas sparging dynamics (Bao et al. 2006; Chapman et al. 1983b).

Machine learning algorithms

Machine learning algorithms have recently been applied successfully in AE applications for condition monitoring of cutting tools (Chen et al. 2011; Li et al. 2015), gearbox fault diagnosis (Li et al. 2016) and data clustering (Pomponi and Vinogradov 2013). The implementation of such techniques in the chemical and manufacturing industries has recently attracted increasing interest, since such data-driven and statistical approaches fit well with the smart manufacturing/Industry 4.0 trend (Wuest et al. 2016).

Techniques involving Artificial Intelligence (AI) have been applied in several fields, including game playing, robotics, and facial and speech recognition (Venkatasubramanian 2019). The first chemical engineering applications were reported in the 1980s for catalyst design (Banares-Alcantara et al. 1985; Bañares-Alcántara et al. 1987), an application that is still the subject of current studies (Ulissi et al. 2017). Recent applications of industrial interest include the estimation of physical properties of organic molecules (National Academies of Sciences 2018), the design of shape memory alloys (Xue et al. 2016) and studies of colloidal self-assembly systems (Spellings and Glotzer 2018). In recent years such investigations have spread to a much wider range of operations thanks to cheaper and faster computers and fast access to large (cloud) storage (Venkatasubramanian 2019). Furthermore, psychological barriers within organisations, in both management and manufacturing, have lowered as AI devices have spread into domestic and personal use (Amazon Alexa is an example).

Many ML techniques, including both unsupervised and supervised algorithms, have been developed. In this study, some of the more common machine learning methods are used to process the AE data in the frequency domain and recognise the different operating conditions, i.e. to solve a classification problem (Wu et al. 2008). The algorithms used here are logistic regression, support vector machine (SVM), k-nearest neighbour (k-NN) and decision tree; the last three were implemented with the MATLAB® Classification Learner app within the Statistics and Machine Learning Toolbox.

Logistic regression has already been applied to acoustic emission data for predicting the reliability of cutting tools and for condition monitoring of bearing elements (Rozak et al. 2018) to provide adequate maintenance schedules (Li et al. 2015). It is a nonlinear statistical method that has also been used extensively beyond machine reliability and life prediction (Caesarendra et al. 2010; Yan et al. 2004; Yan and Lee 2004), in the fields of economics (Martin 1977) and health (Bender and Kuss 2010). The algorithm can be defined as a binomial regression (Hilbe 2009), where the output is the probability \( h_{\theta } \) that a condition is verified; here, that the signal belongs to one specific class. Logistic regression makes use of the logistic function, also known as the sigmoid function:

$$ h_{\theta }^{i} \left( x \right) = \frac{1}{{1 + e^{{ - \left( {\theta^{i} } \right)^{T} x}} }} $$
(1)

In the equation above, \( h_{\theta }^{i} \left( x \right) \) is the probability that the input variable \( x \) (the AE spectrum in this study) belongs to a given class \( i \) (one of the three operating conditions). \( \theta^{i} \) is the vector of parameters (associated with the frequency features in the spectrum) that the machine tunes in the learning step. For each condition \( i \), a parameter vector \( \theta^{i} \) is obtained in the learning process, which consists of an optimisation of \( \theta^{i} \) by gradient descent so that the training dataset is correctly classified. In the test step, the machine computes the probability that a given spectrum belongs to each of the three classes and assigns it to the class with the highest probability \( h_{\theta }^{i} \left( x \right) \).
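The one-classifier-per-class scheme described above can be sketched in a few lines of plain Python. This is not the code used in this work, only a minimal illustration under stated assumptions: the learning rate, the number of epochs, the bias term and the two-feature toy data are all hypothetical choices, and batch gradient descent on the log-loss is assumed.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function of Eq. (1), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_one_vs_rest(X, y, classes, lr=0.5, epochs=500):
    """Fit one parameter vector theta^i per class by batch gradient descent."""
    n_feat = len(X[0])
    thetas = {}
    for c in classes:
        theta = [0.0] * (n_feat + 1)          # theta[0] is an assumed bias term
        for _ in range(epochs):
            grad = [0.0] * (n_feat + 1)
            for xi, yi in zip(X, y):
                target = 1.0 if yi == c else 0.0
                z = theta[0] + sum(t * v for t, v in zip(theta[1:], xi))
                err = sigmoid(z) - target     # gradient of the log-loss w.r.t. z
                grad[0] += err
                for j, v in enumerate(xi):
                    grad[j + 1] += err * v
            theta = [t - lr * g / len(X) for t, g in zip(theta, grad)]
        thetas[c] = theta
    return thetas


def predict(thetas, x):
    """Assign x to the class with the highest probability h_theta^i(x)."""
    def prob(theta):
        return sigmoid(theta[0] + sum(t * v for t, v in zip(theta[1:], x)))
    return max(thetas, key=lambda c: prob(thetas[c]))


# Toy data: two scaled "frequency amplitudes" per spectrum, three classes.
X = [[-0.9, -0.8], [-0.8, -0.9], [0.0, 0.9], [0.1, 0.8], [0.9, -0.1], [0.8, 0.0]]
y = ["ungassed", "ungassed", "loaded", "loaded", "dispersed", "dispersed"]
model = train_one_vs_rest(X, y, classes=sorted(set(y)))
print(predict(model, [-0.85, -0.85]))
```

Each class gets its own binary problem ("this class" against "all others"), so three parameter vectors are trained for the three operating conditions and the final assignment is a simple argmax over the three probabilities.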

The decision tree algorithm (Murthy 1998) addresses the classification problem with a node structure that applies hierarchical, sequential binary classifications. The number of nodes increases moving from the top of the tree towards the bottom branches. The features are ranked during the learning process: those flagged as highly informative are positioned at the top nodes of the tree, while less informative features are placed towards the bottom levels. This algorithm is widely used because its transparent mechanism helps users to visualise how the classifier reaches its decision (Jang 1993). Among the different decision tree algorithms (Kotsiantis 2007), one of the most frequently used, owing to its effectiveness and simplicity, is the Iterative Dichotomiser 3 (ID3) (Soofi and Awan 2017). The tree is controlled by a parameter called the maximum number of splits, which limits the final depth of the tree (Safavian and Landgrebe 1991): the higher this number, the larger, more detailed and more complex the tree. Within the MATLAB® Classification Learner app, two decision tree presets were selected for testing based on this control parameter: fine tree (maximum of 100 splits) and medium tree (maximum of 20 splits).

The k-nearest neighbour (k-NN) algorithms (Cover and Hart 1967) follow a different classification logic. Each instance is assigned to a class based on the class to which its nearest neighbours belong. The principle is that, in the vector space defined by the features used, instances will lie closer to other instances of the same class. The distance between data points was computed using the Euclidean definition (Wu and Zhang 2002), although many other metrics can be used (Canberra, Chebyshev, Minkowski, etc.) (Prasath et al. 2017). One of the critical parameters of the algorithm is k, the number of nearest neighbours considered in assigning each instance to a class. In this work, three values of k are used: 1 (termed fine k-NN), 10 (medium k-NN) and 100 (coarse k-NN). k-NN algorithms are widely used in applications such as face recognition (Kasemsumran et al. 2016), traffic forecasting (Zhang et al. 2013) and speech recognition (Rizwan and Anderson 2014), achieving high performance when large training datasets are available. The method is robust to noisy data and easy to visualise, but it often requires large memory allocation (Soofi and Awan 2017).
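The voting rule can be made concrete with a short sketch. The function name and the toy data below are hypothetical; the example only illustrates the mechanism (Euclidean distance followed by a majority vote) and how a very large k pulls the prediction towards the most populated class, and is not a reproduction of the MATLAB® implementation.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Classify x by majority vote among its k nearest training instances."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], x),   # Euclidean distance
    )
    votes = Counter(label for _, label in by_distance[:k])
    # On a tied vote, the label met first among the nearest neighbours wins.
    return votes.most_common(1)[0][0]


# Toy training set: three "loaded" points near the origin,
# five "dispersed" points near (1, 1).
train_X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
           [1.0, 1.0], [1.1, 1.0], [1.0, 1.1], [1.1, 1.1], [0.9, 1.0]]
train_y = ["loaded"] * 3 + ["dispersed"] * 5
query = [0.05, 0.05]

for k in (1, 3, 8):
    print(k, knn_predict(train_X, train_y, query, k))
```

With k equal to the size of the training set, every instance votes and the query is assigned to the majority class regardless of its position, which is one way in which a coarse k-NN can lose accuracy.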

Support vector machine (SVM) is considered one of the most accurate and robust of the common methods (Vapnik 2000). It is efficient and quick, especially in the training step. Given a dataset, SVM finds the best classification function that divides the instances into two classes. The “best” function is identified geometrically: it is a hyperplane in the case of linear classification, or it can take different shapes depending on the kernel function that characterises the method. Amongst others, this function may be linear, parabolic or hyperbolic. Once the kernel function is defined, the SVM finds the parameters within this function that maximise the margin between the classes (Suykens and Vandewalle 1999). The margin corresponds to the shortest distance between the boundary and the data points closest to it. SVM can deal with a large variety of classification problems, including high-dimensional and non-linear problems (Soofi and Awan 2017). Although SVM is very powerful, it is difficult to visualise (Karamizadeh et al. 2014) and requires accurate, often a priori, selection of a number of parameters within the kernel function.

Among the wide spectrum of algorithms and kernels available in the literature, some of the most established methods are used in this work to process the data-rich AE signal and identify the operating gas–liquid regime under two-phase and three-phase conditions. The objective is to develop a flexible method that can be extended to different systems, and to evaluate its expected accuracy using a supervised machine learning approach.

Materials and methods

Stirred tank configuration

The agitated system, already used by Forte et al. (2019), was a cylindrical Perspex tank equipped with a ring-shaped gas sparger at the bottom. The vessel, with diameter T = 0.14 m, was fitted with four Perspex baffles of width B = T/10. A stainless steel six-blade Rushton disc turbine (RDT6) with diameter D = 0.056 m (D/T = 2/5) was used for stirring. The liquid was an aqueous solution of nickel nitrate hexahydrate (99.99%, Sigma Aldrich®), and its level was set to 0.21 m (H/T = 3/2). A schematic of the tank is shown in Fig. 2, together with the AE measurement equipment.

Fig. 2 Schematic of the stirred tank and the AE equipment

The air flow (between 5 and 10 L min−1, corresponding to 1.5 and 3 vvm respectively) was fed from a ring sparger positioned at the bottom of the tank and equipped with 8 orifices of 0.5 mm. Throughout the experiments, the Reynolds number was kept between 15,000 and 60,000 to ensure a turbulent regime in all cases. The impeller speed (300–1300 rpm) and the gas flow rate were changed to set one of the operating regimes (loading, complete dispersion); measurements were then repeated at the same impeller speeds with no air fed to the system (to mimic failure of the gas feed unit). The occurrence of the different regimes was predicted using flow regime maps and checked visually throughout the experiments, using optical methods as per Forte et al. (2019). The operating conditions for the different regimes are reported in Table 1.

Table 1 Operating conditions for the different regimes

In the three-phase experiments, stainless steel (AISI 316) spherical particles (Alfa Aesar©) were used as the solid phase. The particles, used at concentrations of 3, 4 and 5% w/w, had a density ρs of approximately 8000 kg m−3 and sizes between 0.177 and 0.420 mm. The same operating points as in Table 1 were used.

Acoustic emission

The apparatus was assembled as in similar works in the literature (Nordon et al. 2004): a piezoelectric sensor (Vallen Systeme GmbH, Icking, Germany) with a resonance frequency of 375 kHz and a diameter of 20.3 mm was attached to the tank using a silicone-based vacuum grease to ensure acoustic coupling. The measurement rig also included a preamplifier (40 dB gain, Vallen Systeme GmbH) and a decoupling box (Vallen Systeme GmbH) that removed the electrical noise introduced by the preamplifier before feeding the signal to the oscilloscope (5243A, Pico® Technology Limited) used to record the data, arranged as in Fig. 2. The sensor was placed on the outside wall of the vessel at a height of 0.04 m from the bottom of the tank, corresponding to the impeller region. Twenty-five measurements were taken at each combination of impeller speed and gas flow rate, each lasting 0.2 s at a sampling rate of 1000 kHz, out of a maximum recording capability of 20 MHz.

Data processing and machine learning

The purpose of the present work is to develop a methodology to interrogate the information from the acoustic spectrum to determine the operating regime based on a previously trained model. The acquired signals were pre-processed before they were fed to the machine learning algorithm. A schematic summary of the data processing is shown in Fig. 3.

Fig. 3 Flow diagram summarising the data processing of AE signal

The time-domain signals were processed using the Fast Fourier Transform (Cooley et al. 1969) to obtain the corresponding dataset in the frequency domain. Examples of time-domain and frequency-domain signals for the ungassed, loading and complete dispersion conditions are reported in Fig. 4. Each acquired acoustic spectrum consists of over 200,000 points, i.e. frequencies.

Fig. 4 Time domain (a, b, c) and frequency domain (d, e, f) signals for ungassed (a, d), loaded (b, e) and complete dispersion conditions (c, f)
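The conversion from a time-domain record to an amplitude spectrum can be sketched as follows. This is a minimal illustration and not the processing code of this work: it uses a textbook recursive radix-2 FFT, which needs a power-of-two record length, whereas a library FFT routine would handle the actual record length directly. The 1024-sample record and the 125 kHz tone are hypothetical; only the 1000 kHz sampling rate is taken from the text.

```python
import cmath
import math


def fft(signal):
    """Recursive radix-2 Cooley-Tukey FFT; len(signal) must be a power of two."""
    n = len(signal)
    if n == 1:
        return [complex(signal[0])]
    even = fft(signal[0::2])
    odd = fft(signal[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def amplitude_spectrum(signal, sampling_rate_hz):
    """Return (frequencies, amplitudes) for the bins below the Nyquist frequency."""
    n = len(signal)
    spectrum = fft(signal)
    half = n // 2
    freqs = [k * sampling_rate_hz / n for k in range(half)]
    # The 2/n scaling recovers the amplitude of a sinusoid (it doubles the DC bin).
    amps = [abs(c) * 2 / n for c in spectrum[:half]]
    return freqs, amps


fs = 1_000_000          # 1000 kHz sampling rate, as in the text
n_samples = 1024        # short synthetic record (hypothetical, power of two)
tone_hz = 125_000       # hypothetical tone that falls exactly on a frequency bin
signal = [math.sin(2 * math.pi * tone_hz * i / fs) for i in range(n_samples)]

freqs, amps = amplitude_spectrum(signal, fs)
peak = max(range(len(amps)), key=amps.__getitem__)
print(f"peak at {freqs[peak] / 1000:.1f} kHz, amplitude {amps[peak]:.3f}")
```

The frequency resolution is the sampling rate divided by the number of samples, so a 0.2 s record sampled at 1000 kHz yields bins spaced 5 Hz apart.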

Feature scaling and mean normalisation were then applied to the spectra so that the values at each frequency range from −1 to 1, which avoids biasing the processing towards the features with the highest amplitude (Kouroussis et al. 2000).

Low frequencies are notoriously prone to noise propagation and environmental interference (Nordon et al. 2006; Whitaker et al. 2000), therefore frequencies below 4 kHz were removed from the analysis.
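The two pre-processing steps above can be sketched as follows. The exact scaling formula is not stated in the text, so the sketch assumes one common definition of mean normalisation, (x − mean)/(max − min), applied to each frequency bin across all spectra; the function names and the toy numbers are hypothetical.

```python
def mean_normalise(spectra):
    """Scale each frequency bin across spectra: (x - mean) / (max - min)."""
    n_bins = len(spectra[0])
    scaled = [[0.0] * n_bins for _ in spectra]
    for j in range(n_bins):
        column = [s[j] for s in spectra]
        mean = sum(column) / len(column)
        span = max(column) - min(column)
        for i, value in enumerate(column):
            # A constant bin carries no information; map it to 0.
            scaled[i][j] = (value - mean) / span if span else 0.0
    return scaled


def drop_low_frequencies(freqs, spectra, cutoff_hz=4000.0):
    """Keep only the bins at or above cutoff_hz (4 kHz in the text)."""
    keep = [j for j, f in enumerate(freqs) if f >= cutoff_hz]
    return [freqs[j] for j in keep], [[s[j] for j in keep] for s in spectra]


# Toy example: three spectra with four frequency bins each.
freqs = [1000.0, 3000.0, 5000.0, 7000.0]
spectra = [[4.0, 2.0, 1.0, 0.5],
           [6.0, 2.0, 3.0, 1.5],
           [8.0, 2.0, 2.0, 1.0]]

scaled = mean_normalise(spectra)
kept_freqs, kept = drop_low_frequencies(freqs, scaled)
print(kept_freqs, kept)
```

Because the scaling is applied bin by bin, removing the low-frequency bins before or after normalisation gives the same result for the retained bins.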

Implementing machine learning algorithms requires a large number of computational operations. At the same time, a large number of points may bias the classification algorithm. To reduce the number of interrogated features, a method based on the highest variance in the data is proposed. The frequencies in the spectrum are ranked by their variability across the data, with the most variable ranked first. Based on this ranking, the n most variable frequency peaks are selected as the features characterising each instance before being fed to the machine. This reduces the amount of processed data, as the objective of the study is to investigate the suitability of AE for in-line, real-time installation. The choice of n is investigated up to a maximum value nmax = 30,000, the number of features for which the cumulative variance reaches 99.9% of the variability observed across the spectra. In Fig. 5, an average frequency-domain plot is reported for the three conditions for n = nmax to show the span of the frequencies used across the whole spectrum. The 30,000 frequencies are grouped on a logarithmic scale and shown as a stem plot, for a clearer visual representation of the large amount of data.
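The variance-based ranking and the cumulative-variance criterion can be sketched as follows. The function names and the toy spectra are hypothetical, and the population variance is assumed as the measure of variability; the text does not specify which estimator was used.

```python
from statistics import pvariance


def rank_by_variance(spectra):
    """Return bin indices sorted by decreasing variance, plus the variances."""
    n_bins = len(spectra[0])
    variances = [pvariance([s[j] for s in spectra]) for j in range(n_bins)]
    order = sorted(range(n_bins), key=lambda j: variances[j], reverse=True)
    return order, variances


def n_for_cumulative_variance(order, variances, fraction=0.999):
    """Smallest n whose top-n bins hold at least `fraction` of the total variance."""
    total = sum(variances)
    running = 0.0
    for n, j in enumerate(order, start=1):
        running += variances[j]
        if running >= fraction * total:
            return n
    return len(order)


def select_features(spectrum, order, n):
    """Keep the n most variable bins of one spectrum, in ranking order."""
    return [spectrum[j] for j in order[:n]]


# Toy example: three spectra with four frequency bins each.
spectra = [[0.0, 1.0, 5.0, 2.0],
           [0.0, 3.0, 1.0, 2.1],
           [0.0, 2.0, 9.0, 1.9]]

order, variances = rank_by_variance(spectra)
n_max = n_for_cumulative_variance(order, variances)
print(order, n_max, select_features(spectra[0], order, n_max))
```

The ranking is computed once on the training data and the same bin indices are then applied to any new spectrum, so the cost at prediction time is only the lookup of n values.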

Fig. 5 Frequency distribution across the spectrum for ungassed (a), loaded (b) and complete dispersion (c) regime

As seen in Fig. 5, the frequencies selected in the spectra for the three conditions exclude those above 200 kHz. Note that the logarithmic grouping uses wider bins at higher frequencies; hence, for n = nmax the most variable features lie in the audible range (< 20 kHz). This becomes more evident as gas dispersion increases towards the complete dispersion regime: the peaks at lower frequencies increase in intensity.

Following the feature selection step, the data are divided into three datasets for the assessment:

  • A training dataset, consisting of 60% of the acquired data, is fed to the machine for the training process, together with the class to which each spectrum belongs.

  • A cross-validation dataset, consisting of 20% of the acquired data, is used to select the optimum number of features.

  • A test dataset, consisting of the remaining 20% of the data and unseen by the machine, is used to evaluate the final accuracy of the method.

The three datasets are built by taking, for each operating point (fixed impeller speed and gas feed rate), the corresponding percentage of spectra. For example, of the 25 AE spectra recorded at a given operating point, 15 are included in the training dataset, 5 in the cross-validation dataset and the remaining 5 in the test dataset.
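The per-operating-point split described above can be sketched as follows. The function names, the dictionary layout and the placeholder spectrum labels are hypothetical; the two operating points are used only as an example, and the random seed stands in for one random selection of the datasets.

```python
import random


def split_operating_point(spectra, rng, train=0.6, validation=0.2):
    """Shuffle the spectra of one operating point and split them 60/20/20."""
    shuffled = list(spectra)
    rng.shuffle(shuffled)
    n_train = round(train * len(shuffled))
    n_val = round(validation * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def build_datasets(data_by_point, seed):
    """Apply the split to every operating point and pool the labelled parts."""
    rng = random.Random(seed)
    datasets = {"train": [], "validation": [], "test": []}
    for label, spectra in data_by_point.values():
        parts = split_operating_point(spectra, rng)
        for name, part in zip(datasets, parts):
            datasets[name] += [(s, label) for s in part]
    return datasets


# Keys are (impeller speed in rpm, gas rate in vvm); strings stand in for spectra.
data_by_point = {
    (600, 1.5): ("loaded", [f"loaded_600rpm_{i}" for i in range(25)]),
    (800, 1.5): ("dispersed", [f"dispersed_800rpm_{i}" for i in range(25)]),
}
datasets = build_datasets(data_by_point, seed=0)
print({name: len(rows) for name, rows in datasets.items()})
```

Splitting within each operating point, rather than over the pooled data, guarantees that every combination of impeller speed and gas flow rate is represented in all three datasets.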

The task is addressed as a classification problem: in the training step, the system gets as an input the spectra and the corresponding class (ungassed, loaded, completely dispersed). Depending on the used algorithm, the machine builds criteria to identify the correspondence between the AE data and their known operating condition. When the training is complete, given a new AE set of data the system will provide as an output the predicted operating condition.

The training process uses acoustic signals acquired at different impeller speeds and air flow rates in the three regimes. The cross-validation step consists of identifying the optimum number of features (n) to use for the analysis. The criteria used to select n automatically depend on two parameters:

  • the processing time; in this case identified as the time taken by the CPU to process the learning step, arbitrarily decided to be lower than 1 s;

  • the average accuracy of the methods; four variants are chosen, one from each family of the algorithms used, and their accuracy is averaged to identify the optimum region.

In the test step, the performance of the algorithms is evaluated using two metrics. The first is the accuracy, calculated as the ratio of the number of correctly predicted cases to the total number of cases; the second is the F1-score (Powers 2011). The latter is widely used in medical applications for diagnostic methods (Ferizi et al. 2019; Rink et al. 2011) and takes into account false positives as well as false negatives. It is defined as:

$$ F_{1} = 2\frac{P \cdot R}{P + R} $$
(2)

where P is the precision and R is the recall, defined, for a binary classification, in (3) and (4) respectively:

$$ P = \frac{True\;positive}{True \;positive + False\; positive} $$
(3)
$$ R = \frac{True\; positive}{True \;positive + False \;negative} $$
(4)

For each condition, precision is the fraction of correctly assigned instances amongst all the instances assigned to that class, while recall is the fraction of correctly recognised instances over the total number of instances belonging to that class (Apte et al. 1994). These definitions hold for binary classification, in which the algorithm must decide whether an instance belongs to the positive or the negative class (medical diagnosis is an example). In multiclass classification, the F1-score is calculated for each condition and the scores are then weighted. Therefore, in this work the comparison is made on the weighted average F1-score (Al-Salemi et al. 2018): the closer the F1-score is to 1, the more reliable and precise the prediction.
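The multiclass calculation can be sketched as follows, treating each class in turn as the "positive" class of Eqs. (2)–(4). The weighting is assumed here to be by class support (the number of true instances of each class), which is a common convention; the text does not state the weights explicitly. The function names and the ten toy labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 (Eqs. 2-4), averaged with weights equal to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * n_cls / total
    return score


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example: U = ungassed, L = loaded, D = completely dispersed.
y_true = ["U", "U", "U", "L", "L", "L", "D", "D", "D", "D"]
y_pred = ["U", "U", "U", "L", "L", "D", "D", "D", "D", "L"]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"weighted F1 = {weighted_f1(y_true, y_pred):.2f}")
```

Accuracy and weighted F1-score can coincide, as in this balanced toy case, but they diverge when one class dominates the dataset or when errors are concentrated in a single class.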

The process of training and testing the machine learning algorithms is repeated ten times; for each repetition, the data included in the training and test datasets are randomly selected. This avoids the bias in the evaluated accuracy that could arise from a “lucky” training dataset. The reported results are the average values over the ten repetitions.

In this study, two test case datasets were acquired, one at the same conditions of gas flow rate and impeller speed with the bi-phasic mixture, and a second one in which solids are added to the mixture.

Results and discussion

Choice of input parameters

One of the potential issues for the application of AE as a real time tool is the computational time; it is essential that the analysis can be carried out in short time to evaluate the operating condition instantaneously. In this study, the parameters used by the algorithms to recognise the mixing regime are the spectrum points: the amplitude of each frequency present in the spectrum.

However, AE signals are usually acquired at a high sampling frequency, resulting in a large amount of data; this can delay processing and, at the same time, makes it harder for the algorithms to avoid bias and overfitting. For this reason, an initial study is carried out by varying n, the number of features in the spectrum used for the classification. After ranking the frequencies in the spectrum by decreasing variance, a variable number, n, of frequencies is fed to the training process of the different algorithms. The trained models are then challenged to correctly classify the cross-validation dataset; as for the training dataset, the corresponding n features of the cross-validation spectra are used in this optimisation step. For this initial test, carried out on the biphasic case, one algorithm of each family is used. The objective is to identify the number of features that represents a good compromise between accuracy and processing time for the different algorithms. The results, in terms of accuracy, are shown in Fig. 6.

Fig. 6 Accuracy over the cross-validation dataset as a function of the number of features n for logistic regression, decision tree (fine), SVM (with linear kernel) and k-NN (with k = 10) in the gas–liquid case

Figure 6 highlights how critical the choice of the parameter \( n \) is in ensuring good prediction performance. The SVM already achieves around 75–80% accuracy with a small number of features (n = 100), with a local maximum between 1000 and 3000 features, whereas the other three algorithms perform significantly worse at low numbers of features. Their accuracy increases progressively with n until the range of the SVM maximum is reached. In this range the performance of all algorithms, except the decision tree, is comparable to that of the SVM. When the number of features is increased above 10,000, all the algorithms except the decision tree perform worse. This is because, as the dimension of the vector space increases, the algorithms may be misguided by unimportant parts of the spectrum that tend to “homogenise” the data acquired at different conditions. The decision tree is less affected, mainly because it is not based on geometrical parameters. The reason the SVM performs better than the others even with few features lies in the structure of the data in the investigated vector space: the three classes can be separated by the SVM linear kernel with an accuracy of around 80% even at low numbers of features. Training times for the different numbers of features are reported in Table 2 for the three methods built in the MATLAB® Classification Learner (the logistic regression algorithm was not considered in this comparison because it was manually coded by the authors).

Table 2 Training time for fine tree, SVM (linear kernel), k-NN and the average value for the two-phase case

When n is increased above 1000, the processing time exceeds 1 s; moreover, n = 1000 lies in the region of maximum accuracy for three of the four tested methods (Fig. 6). This value is therefore taken as the number of features for further analysis.

Although the optimisation is computed for the two-phase case, the training time for the three-phase case stays under 1 s as reported in Table 3.

Table 3 Training time for fine tree, SVM (linear kernel), k-NN and the average value for the three-phase case

Two-phase test

The training was carried out with the chosen number of features on the training dataset acquired for the biphasic condition. The test dataset, unseen by the machine in the previous steps, was used to obtain the prediction performance of the chosen algorithms. Results in terms of accuracy for each algorithm in the different regimes are reported in Table 4.

Table 4 Average accuracy of the machine learning algorithms for the different regimes in the gas–liquid case

In terms of accuracy, all the methods except coarse k-NN exceed 84%. Among the tested algorithms, the SVMs with quadratic and cubic kernels appear to outperform the linear SVM and the other algorithms, with accuracy equal to or higher than 90% in the three investigated conditions. Logistic regression also shows high accuracy compared with the other methods, while it is interesting to observe that k-NN performance degrades as the number of neighbours k is increased from 1 (fine k-NN) to 100 (coarse k-NN). In classification problems, accuracy alone is often not enough to describe the performance of an algorithm; therefore, for further comparison, a weighted F1-score (Powers 2011) is calculated for the different algorithms as an average over the 10 repetitions. The results are shown in Table 5.

Table 5 Weighted F1 score obtained in the gas–liquid regime prediction case

Quadratic and cubic SVM achieve the highest F1-scores, confirming what was observed in the accuracy comparison. By way of example, a parity plot from one of the repetitions, showing the correct predictions and the failures of the SVM cubic algorithm, is reported in Fig. 7. The points on the identity line are the correct assignments, while the misclassifications fall in the other regions of the plot. The data shown are from one of the repetitions (the median dataset) reported in Table 4.

Fig. 7 Parity plot reporting SVM quadratic results in classifying the different test datasets in the biphasic case

All the ungassed cases are correctly identified by the algorithm, while three datasets each for loading and complete dispersion are misclassified. Such misclassifications can be due to limitations of the model, but also to temporary oscillations of the mixing condition between the two regimes. Indeed, the misclassified points lie at the edge between the two regimes, at N = 600 rpm for loading and N = 800 rpm for complete dispersion (both at a gas feed rate of 1.5 vvm).
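The information in such a parity plot can also be tallied as counts of (true, predicted) pairs, where the pairs with equal entries correspond to the points on the identity line. The sketch below is illustrative only: the test-set size of 20 spectra per regime is hypothetical, and it is assumed that each misclassified loading or complete dispersion dataset is assigned to the other of those two regimes.

```python
from collections import Counter

REGIMES = ["ungassed", "loading", "complete dispersion"]

def confusion_counts(y_true, y_pred):
    """Count (true, predicted) pairs; pairs with t == p lie on the identity line."""
    return Counter(zip(y_true, y_pred))

def per_regime_errors(y_true, y_pred):
    """Number of misclassified datasets for each true regime."""
    counts = confusion_counts(y_true, y_pred)
    return {r: sum(n for (t, p), n in counts.items() if t == r and p != r)
            for r in REGIMES}

# Hypothetical test set: 20 spectra per regime (the real set sizes are not assumed here)
y_true = ["ungassed"] * 20 + ["loading"] * 20 + ["complete dispersion"] * 20
y_pred = (["ungassed"] * 20
          + ["loading"] * 17 + ["complete dispersion"] * 3
          + ["complete dispersion"] * 17 + ["loading"] * 3)

errors = per_regime_errors(y_true, y_pred)
print(errors)
```

A tally of this kind makes it easy to check whether the errors are confined to neighbouring regimes, as would be expected if they arise from operating points close to a regime transition.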

Three-phase test

In the second part of the study, stainless-steel particles are added at different concentrations and the acoustic signal is acquired in the three regimes. The same algorithms used for the biphasic system are applied to assess the ML capability in this case, with the number of features used, n, also kept unchanged. The results in terms of overall accuracy and F1-score are summarised in Table 6.

Table 6 Overall accuracy and weighted F1-score for the different algorithms in gas–solid–liquid regime prediction

In the three-phase cases the performance of all the algorithms is high, with fine and medium k-NN achieving performance comparable to that of the cubic and quadratic SVMs (still the highest-performing algorithms) and of logistic regression. Figure 8 shows the parity plot obtained in the three-phase case. The data shown are from one of the repetitions (the median dataset) reported in Table 6.

Fig. 8 Parity plot reporting SVM quadratic results in classifying the different signals in the gas–solid–liquid test

This result confirms that, just as the presence of solids at low concentration does not have major effects on the gas–liquid sparging dynamics within the tank, it also does not significantly affect the ability of the ML tools to use acoustic emission features to characterise the bi-phasic mixture. The algorithms can be trained to recognise the acoustic emission caused by the presence of bubbles and their interaction with the other phases, and the system can therefore identify the corresponding regime.

Dynamic condition

To further challenge the technique, and to verify its flexibility over deviations from an initial well-defined condition, an additional test was run. The trained algorithm with performance closest to the median in the gas–liquid condition is challenged to recognise the corresponding regimes in the three-phase condition. Such a situation is realistic when a system is characterised in a starting bi-phasic condition and then, during operation, a third phase is generated by reaction or added to the mixture. The data acquired for the three-phase experiment are therefore fed to the algorithm trained in the biphasic condition. The results, reported in Table 7, show a significant decrease in accuracy and F1-score for all the algorithms except logistic regression. The influence of the solids on the acoustic spectrum prevents these algorithms from recognising the correct gas–liquid regime. This represents a limit to the flexibility of the geometry-based algorithms (SVM and k-NN) and of the decision tree, which clearly do not succeed in correlating characteristic features of the biphasic case with the corresponding ones in the three-phase case. Logistic regression, instead, performs similarly to the previous two cases. The reason lies in the different nature of this algorithm: for each instance, the probability of belonging to each of the three regimes is calculated independently. Although all these probabilities decrease, the correct one still outranks the others, and the system is therefore able to predict the gas–liquid condition correctly.
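The argument about independently calculated probabilities can be illustrated with a small numerical sketch. It assumes a one-vs-rest reading of the logistic regression, in which each regime has its own sigmoid output, and it assumes, purely for illustration, that the solids shift every linear score downward by the same amount. The scores are hypothetical and are not fitted values from this study.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_one_vs_rest(scores):
    """scores: regime -> linear score. Each probability is computed independently,
    and the regime with the highest probability is returned."""
    probs = {regime: sigmoid(z) for regime, z in scores.items()}
    return max(probs, key=probs.get), probs

# Hypothetical linear scores for one spectrum acquired in the loading regime
two_phase_scores = {"ungassed": -3.0, "loading": 2.5, "complete dispersion": -1.0}
# Assumption: the solids lower every score by the same hypothetical offset
three_phase_scores = {r: z - 2.0 for r, z in two_phase_scores.items()}

label_2p, probs_2p = predict_one_vs_rest(two_phase_scores)
label_3p, probs_3p = predict_one_vs_rest(three_phase_scores)
print(label_2p, {r: round(p, 3) for r, p in probs_2p.items()})
print(label_3p, {r: round(p, 3) for r, p in probs_3p.items()})
```

In this toy case every probability decreases, yet the ranking is preserved and the predicted regime is unchanged. A classifier relying on fixed geometric boundaries or distances in feature space has no such guarantee when the whole feature distribution shifts.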

Table 7 Overall accuracy and weighted F1-score for the different algorithms in gas–solid–liquid regime prediction using the algorithms trained in the gas–liquid condition

Figure 9 reports the parity plot for the best case (logistic regression), which does not show any particular cluster of datasets being systematically misclassified. Of the processed data, 91.2% are correctly recognised by the ML.

Fig. 9 Parity plot reporting logistic regression results in classifying the different signals in the gas–solid–liquid test using the gas–liquid trained algorithms

Logistic regression, already performing well in the other cases, demonstrates higher flexibility than the other machine learning algorithms investigated in this study. This may well be related to the specific system used, and the choice of data-processing algorithm remains an important issue to be addressed. Deviations from ideal conditions are an everyday challenge in process plants, and the ability to adapt to unforeseen conditions is a critical characteristic for an implementable measurement technique.

Conclusions

In this work, a methodology was proposed that uses acoustic emission data in combination with machine learning algorithms to identify gas–liquid mixing conditions in a stirred tank. The machine was trained to recognise ungassed, loading and recirculation conditions using a training dataset consisting of the acoustic spectra acquired in the different regimes in a two-phase (gas–liquid) and a three-phase (gas–solid–liquid) stirred tank. The training procedure was repeated ten times for each algorithm by varying the data points in the learning dataset. Average results show that, with appropriate training, some of the investigated algorithms (SVM with quadratic and cubic kernels, and logistic regression) were able to correctly recognise the regimes corresponding to the tested spectra with accuracy higher than 90%.

The system was then challenged to predict the gas sparging regime in the three-phase runs using the learning gained in the bi-phasic condition, in order to evaluate the flexibility of the method in dynamic conditions (as, for example, in precipitation batch processes). Most of the algorithms did not successfully address this task, with accuracy in some cases significantly lower than 50%; logistic regression, however, was able to perform at the same level as in the previous tests (> 90%).

The study aimed to investigate AE, in combination with machine learning algorithms, as a potential diagnostic and condition monitoring technique for fluid mixing applications, and proposes a methodology to evaluate the most appropriate algorithms. Although only some of the wide range of algorithms available in the literature were considered, this investigation suggests that the combined use of AE and ML could represent a powerful tool for in situ monitoring of two-phase and three-phase mixing.