Data-driven forward osmosis model development using multiple linear regression and artificial neural networks

https://doi.org/10.1016/j.compchemeng.2022.107933

Highlights

  • Forward osmosis data-driven modeling for a highly complex, metastable feed solution based on experimental data from whey-permeate concentration.

  • Excellent R² value of 0.9849 for test data.

  • Enhancements to evaluation of process performance for industrial use cases.

Abstract

This work investigates the capability of multiple linear regression (MLR) and artificial neural networks (ANN) to model permeate flux in a thermodynamically complex forward osmosis (FO) process. Whey-permeate was concentrated to a dry matter content of more than 55 %, creating a highly supersaturated metastable solution and exceeding the established boundaries of conventional membrane technology. Different ANN architectures were trained and tested with a varying number of hidden layers and neurons to find an accurate structure.

Furthermore, the evaluated significance of the input parameters was used to reduce the model's complexity. This work shows that both approaches (MLR: R²test = 0.9718, ANN: R²test = 0.9849) were able to model the FO's permeate flux accurately, even with a reduced number of inputs. Finally, due to its slightly better performance, the ANN was used to outline the influence of FS inlet flow and process temperature.

Introduction

Forward osmosis (FO) is a concentration-driven separation technology which finds its application in seawater desalination, wastewater treatment, and concentration of liquid foods (Haupt and Lerch, 2018). The driving force is the osmotic pressure of a typically salty solution, called draw solution (DS). The DS makes water from the feed solution (FS) permeate through a semipermeable membrane. Thus, FO does not require pressure or heat.
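
As a rough orientation only, the osmotic driving force described above is often written in an idealized textbook form. This is not the model used in this work: it neglects concentration polarization and reverse solute flux, and the symbols are assumptions introduced here (J_w: water flux, A: water permeability coefficient of the membrane, π: osmotic pressure, i: van 't Hoff factor, c: molar solute concentration, R: universal gas constant, T: absolute temperature).

```latex
J_w = A \left( \pi_{\mathrm{DS}} - \pi_{\mathrm{FS}} \right),
\qquad
\pi \approx i \, c \, R \, T
```

The van 't Hoff approximation on the right holds only for dilute, near-ideal solutions, so it should not be expected to describe a highly concentrated or supersaturated feed.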

Dairy processes must adhere to high hygienic standards and require multiple purification and separation stages to be performed, which leads to high energy demands. One of these processes is the treatment of whey, which is a byproduct of cheese production remaining after the separation of casein and fat. In the past, whey was treated as a waste product, used exclusively for animal feeding. Modern membrane technologies offer the opportunity to produce valuable compounds from this byproduct. A typical use of whey is the production of whey protein concentrate by ultrafiltration. Whey-permeate remains as a byproduct and can be used as a substrate for lactose production. Conventional lactose production is based on the supersaturation of a lactose solution, followed by selective crystallization (McSweeney and Fox, 2009). Hence, whey-permeate needs to be concentrated. Common concentration technologies are nanofiltration, reverse osmosis (RO), and evaporation. FO offers an alternative approach to concentrating lactose permeate, with several advantages, including low energy consumption, the use of renewable energy, low irreversible fouling potential, and high concentration factors (Haupt and Lerch, 2018).

The determination of permeate flux and reverse solute flux (RSF) is common practice for evaluating the suitability of FO. The development of transport-based system models is exceptionally complex due to the presence of two interacting fluids with non-stationary characteristics (e.g. concentration, velocity, viscosity) as well as the asymmetric structure of thin film composite membranes.

Furthermore, the FS used in this study is supersaturated and thus shows metastable characteristics. This high complexity requires extensive knowledge of the FO process at hand. The development of a thermodynamic model capable of predicting process parameters with sufficient accuracy would require extensive details on solute transfer, operational parameters, thermodynamic properties of FS and DS as well as membrane architecture. Therefore, the prediction of FO performance by mechanistic models reaches its limit in industrial use cases like whey-permeate concentration and is commonly focused on pure water or synthetic salt solutions for FS and DS.

Several experimental studies examined the feasibility of whey concentration by FO (Aydiner et al., 2013; Chen et al., 2019; Menchik and Moraru, 2019; Seker et al., 2017; Wang et al., 2017). Sodium chloride solutions, mostly used as DS in these experiments, led to maximum whey concentrations ranging from 14% to 18% (Aydiner et al., 2013; Chen et al., 2019; Wang et al., 2017), although the authors had estimated a total dry matter content of 25% to 35% should be attainable (Aydiner et al., 2013). Furthermore, Menchik and Moraru (2019) investigated the concentration of acid whey using potassium lactate as DS, and enriched the whey's dry matter content to about 40 Brix. Seker et al. (2017) studied the concentration of whey by means of adding ammonium carbonate as DS.

The aforementioned studies all concentrated pure whey by FO, whereas this study focuses on the concentration of whey-permeate (i.e. ultrafiltrated whey). Furthermore, the whey-permeate is enriched to a dry matter content exceeding the level of previous studies. At this dry matter content, the lactose solubility limit of the solution is exceeded by far. Thus, a complex FS with metastable thermodynamic characteristics is generated.

A number of theoretical transport models have been developed to calculate permeate flux and RSF in FO. McCutcheon et al. and Gray et al. devised detailed mass transfer models considering internal and external concentration polarization and membrane orientation using deionized water and sodium chloride solutions as FS and DS (McCutcheon and Elimelech, 2006, 2007). These models offer a more in-depth understanding of the fundamental mass transfer in FO and give important indications for the direction of future developments in membrane fabrication. Bui et al. (2015) expanded the models of internal and external concentration polarization to calculate the structural parameter of FO membranes, which gives an idea of the mechanical structure of the membranes’ support layers. Qasim et al. (2015) summarized various state-of-the-art flux models for FO applications in desalination processes. Furthermore, various studies examined the behavior of membrane modules (Goda and Sekino, 2020; Phuntsho et al., 2014; You et al., 2012).

These modeling approaches all share a requirement for detailed substance data and thus predominantly focus on synthetic FSs and DSs with well-known characteristics (e.g. H2O + NaCl). For complex systems like supersaturated whey-permeate, a data-driven approach using artificial neural networks (ANN) or multiple linear regression (MLR) could be more suitable for predicting FO performance while avoiding thermodynamic complexities.

Since FO is a more recently developed technology, the scientific discourse about the application of ANN to various membrane processes primarily focuses on RO and microfiltration (Chen and Kim, 2006; Dornier et al., 1995; Ghandehari et al., 2011; Lee et al., 2009; Libotean et al., 2009). Most such studies focus on optimization of process performance in membrane processes, especially in seawater desalination plants. ANNs have been observed to predict water flux as well as the permeate's total dissolved solids (TDS) with very promising results (permeate flux: 0.75 (Lee et al., 2009) ≤ R²test ≤ 0.993 (Ghandehari et al., 2011); TDS: R²test = 0.96 (Lee et al., 2009)). ANNs obtained more accurate results than conventional alternatives such as blocking models (intermediate blocking and standard blocking (Ghandehari et al., 2011)) and multiple linear regression (MLR) (Chen and Kim, 2006). Chen and Kim (2006) found that a radial basis function neural network (RBFNN) is more accurate than the more commonly used backpropagation neural network (BPNN) for predicting permeate flux decline. However, few other studies have compared the performance of different kinds of ANNs. The same publication also found any kind of ANN to be clearly superior to more traditional methods of statistical analysis (i.e. MLR).

Due to the limited amount of research on FO, the potential of ANNs for process modeling was only examined in a few studies. Pardeshi et al. applied a Taguchi-Neural approach to model the maximum reverse solute flux selectivity (RSFS) of a lab-scale FO module (Pardeshi et al., 2016). The primary objective was to define the parameter levels (temperature, velocity) required for an optimum operation of the setup. For this reason, the temperature and velocity of the FS and DS are the only two parameters that were considered significant for model development. Aghilesh et al. developed and compared FO models to predict water flux and RSF in textile wastewater treatment using ANN, response surface methodology (RSM) and adaptive neuro-fuzzy inference system (ANFIS) (Aghilesh et al., 2021). Comparing those models, RSM predicted water flux with the greatest accuracy (R² = 0.8529), while ANFIS was most capable of predicting RSF (R² = 0.9427).

A study by Jawad et al. applied a multilayer feed forward neural network (FNN) to 709 data points that were extracted from several studies conducting lab-scale FO experiments (Jawad et al., 2020). Nine different parameters (e.g. FS/DS concentration, FS/DS temperature, membrane type, crossflow velocity, type of DS) were used to predict the permeate flux of a typical lab-scale FO module. The study showed promising results for ANN (R²test = 0.821), especially in comparison to a model using MLR (R²train = 0.516). Furthermore, Jawad et al. developed an ANN to predict permeate flux in lab-scale FO experiments with NaCl solutions and deionized water (Jawad et al., 2021). ANN was able to predict water flux precisely (R²test = 0.9978). Subsequently, specific weights of ANN were analyzed to assess the relative importance of different FO parameters (sensitivity analysis). Finally, RSM was used to determine the optimum operating conditions.

Nevertheless, all aforementioned publications focus on FSs at relatively low concentrations, which are much less complex to handle in FO processing. Besides using a data-driven modeling approach, these processes could also be modeled using common thermodynamic approaches if the physical properties of the FSs and the DSs are known.

In this study, the concentration of whey-permeate is investigated up to concentrations exceeding lactose solubility. Thus, the FS is a metastable system with unknown physical properties, which makes detailed thermodynamic modeling particularly complex. As the literature review shows, the capability of data-driven modeling of the concentration process of a metastable solution within FO has yet to be investigated.

The objective of this study is to examine the capability of data-driven modeling technologies (MLR and ANN) to model permeate flux in a highly complex FO application, based on internally produced experimental data. All models are evaluated via training and test data to compare their suitability for both known and unknown data. First, MLR and ANN are developed using all available inputs. Subsequently, the significance of the input variables is assessed and used to reduce MLR's and ANN's complexity before simplified versions of the MLR and ANN models are evaluated and compared to each other. To demonstrate the capabilities of the models, the best fitting model is applied to examine the influence of temperature and FS inlet flow on the process for systems with differing membrane area. Applied in this manner, data-driven models help with optimizing the process conditions of FO processes. Furthermore, the models can support the monitoring of process performance and may allow for a faster response to performance fluctuations.
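
To illustrate what comparing ANN structures on training and test data can look like, the following minimal sketch trains small feed-forward networks with different hidden-layer sizes and reports R² for both subsets. It is not the authors' implementation: the data are synthetic, the network has a single hidden layer for brevity, and the names `train_mlp`, `r2_score` and `results`, the 124/32 split and all hyperparameters are assumptions made for this example.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def train_mlp(X, y, n_hidden, epochs=2000, lr=0.05, seed=0):
    """Train a one-hidden-layer tanh network by full-batch gradient descent
    on the mean squared error and return a prediction function."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    target = y.reshape(-1, 1)
    n = len(X)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                        # hidden activations
        err = H @ W2 + b2 - target                      # prediction error
        grad_out = 2.0 * err / n                        # d(MSE)/d(prediction)
        grad_hid = (grad_out @ W2.T) * (1.0 - H ** 2)   # backprop through tanh
        W2 -= lr * (H.T @ grad_out)
        b2 -= lr * grad_out.sum(axis=0)
        W1 -= lr * (X.T @ grad_hid)
        b1 -= lr * grad_hid.sum(axis=0)
    return lambda X_new: (np.tanh(X_new @ W1 + b1) @ W2 + b2).ravel()

# Synthetic stand-in data (NOT the experimental dataset): 156 points, 3 scaled inputs.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(156, 3))
y = (1.5 * X[:, 0] - 0.8 * X[:, 1] + 0.5 * X[:, 0] * X[:, 2]
     + rng.normal(0.0, 0.05, size=156))

# Assumed random split into 124 training and 32 test points.
idx = rng.permutation(156)
train, test = idx[:124], idx[124:]

# Compare hidden-layer sizes on known (training) and unknown (test) data.
results = {}
for n_hidden in (2, 4, 8):
    predict = train_mlp(X[train], y[train], n_hidden)
    results[n_hidden] = (r2_score(y[train], predict(X[train])),
                         r2_score(y[test], predict(X[test])))

for n_hidden, (r2_train, r2_test) in results.items():
    print(f"{n_hidden} hidden neurons: R2 train = {r2_train:.4f}, R2 test = {r2_test:.4f}")
```

A large gap between training and test R² for a larger network would hint at overfitting, which is one reason to evaluate every candidate structure on data it has not seen.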

Experimental design

The experimental dataset comprises 156 experimental data points at different operating conditions (Supplementary Material). To evaluate the effect of the process conditions on FO performance, whey-permeate was concentrated by two FO pilot plants, similar in their designs (Figs. A1, A2). The experimental setup was separated into two stages.

The first stage focused on the influence of the basic process parameters, such as temperature, FS concentration, DS concentration, FS inlet flow rate and DS

Calculation

In this study, the calculation part was split into four sequential sub-parts. In the first step, an MLR model was developed considering all available independent variables from the training dataset. In the development of MLR models, all data are typically used to derive the most accurate regression coefficients. However, as this study focuses on comparing the capability of MLR and ANN to predict the permeate flux of an FO process from unknown inputs, a split of testing data is required for both
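
The MLR step with a held-out test set can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the five generic inputs, the coefficients in `true_coef` and the 80/20 split ratio are hypothetical, and only the number of data points (156) is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in data (NOT the experimental dataset): 156 points, 5 generic inputs.
n_points, n_inputs = 156, 5
X = rng.uniform(0.0, 1.0, size=(n_points, n_inputs))
true_coef = np.array([4.0, -2.5, 1.5, 0.8, 0.0])   # hypothetical; last input is irrelevant
y = 3.0 + X @ true_coef + rng.normal(0.0, 0.1, size=n_points)

# Assumed random 80/20 split into training and test data.
idx = rng.permutation(n_points)
n_train = int(0.8 * n_points)
train, test = idx[:n_train], idx[n_train:]

def add_intercept(X):
    """Prepend a column of ones so the first coefficient is the intercept."""
    return np.column_stack([np.ones(len(X)), X])

# Ordinary least squares fit on the training data only.
coef, *_ = np.linalg.lstsq(add_intercept(X[train]), y[train], rcond=None)

def predict(X_new):
    return add_intercept(X_new) @ coef

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

r2_train = r2_score(y[train], predict(X[train]))
r2_test = r2_score(y[test], predict(X[test]))
print("intercept and coefficients:", np.round(coef, 3))
print(f"R2 train = {r2_train:.4f}, R2 test = {r2_test:.4f}")
```

Fitting on the training subset alone costs some precision in the coefficients, but it yields a test R² that can be compared directly with that of an ANN evaluated on the same unseen points.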

Results and discussion

Chapter 3 demonstrated the structured model development, validation and simplification. Overall, all models developed in the previous chapter are able to describe the process well (lowest R²All = 0.9587). Table 5 lists the coefficients of determination of all models considered.
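
For reference, the coefficient of determination is commonly defined as shown here. The notation is an assumption made for this note: y_i are the measured permeate fluxes, ŷ_i the model predictions, ȳ the mean of the measured values, and n the number of points in the subset considered (training, test, or all data).

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

A value of 1 corresponds to a perfect fit, while a value of 0 means the model performs no better than always predicting the mean.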

Neither MLR nor ANN shows a significantly lower coefficient of determination for the reduced model than for the unreduced version. Hence, the following discussion will focus on the reduced models.

Conclusion

In this study, an ANN as well as an MLR were developed to model the permeate flux in a highly complex FO process. The process aims at the production of highly supersaturated whey-permeate solutions which can be used as a substrate for lactose production.

Both models were created using all available input values as origin and then systematically simplified by eliminating insignificant process parameters. The study shows that both modeling approaches are capable of modeling permeate flux in the FO

CRediT authorship contribution statement

Lukas Gosmann: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing – original draft, Visualization, Project administration. Christian Geitner: Methodology, Software, Formal analysis, Writing – original draft, Visualization. Nora Wieler: Methodology, Software, Validation, Formal analysis, Writing – original draft, Visualization.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
