Data-driven forward osmosis model development using multiple linear regression and artificial neural networks
Introduction
Forward osmosis (FO) is a concentration-driven separation technology which finds its application in seawater desalination, wastewater treatment, and concentration of liquid foods (Haupt and Lerch, 2018). The driving force is the osmotic pressure of a typically salty solution, called draw solution (DS). The DS makes water from the feed solution (FS) permeate through a semipermeable membrane. Thus, FO does not require pressure or heat.
Dairy processes must adhere to high hygienic standards and require multiple purification and separation stages, which leads to high energy demands. One of these processes is the treatment of whey, which is a byproduct of cheese production remaining after the separation of casein and fat. In the past, whey was treated as a waste product, used exclusively for animal feed. Modern membrane technologies offer the opportunity to produce valuable compounds from this byproduct. A typical use of whey is the production of whey protein concentrate by ultrafiltration. Whey-permeate remains as a byproduct and can be used as a substrate for lactose production. Conventional lactose production is based on the supersaturation of a lactose solution, followed by selective crystallization (McSweeney and Fox, 2009). Hence, whey-permeate needs to be concentrated. Common concentration technologies are nanofiltration, reverse osmosis (RO), and evaporation. FO offers an alternative approach to concentrating whey-permeate, with several advantages, including low energy consumption, the use of renewable energy, low irreversible fouling potential, and high concentration factors (Haupt and Lerch, 2018).
Determining permeate flux and reverse solute flux (RSF) is common practice for evaluating the suitability of FO. Developing transport-based system models is exceptionally complex due to the presence of two interacting fluids with non-stationary characteristics (e.g. concentration, velocity, viscosity) as well as the asymmetric structure of thin film composite membranes.
Furthermore, the FS used in this study is supersaturated and thus shows metastable characteristics. This high complexity requires extensive knowledge of the FO process at hand. Developing a thermodynamic model capable of predicting process parameters with sufficient accuracy would require extensive details on solute transfer, operational parameters, thermodynamic properties of FS and DS as well as membrane architecture. Therefore, the prediction of FO performance by mechanistic models reaches its limits in industrial use cases like whey-permeate concentration and is commonly focused on pure water or synthetic salt solutions for FS and DS.
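As an illustration of the two quantities named above, permeate flux and RSF are often computed from simple mass balances over a batch experiment. The following minimal sketch assumes that the permeated volume and the draw-solute concentration in the feed are measured at the start and end of a run. The function names, units and numbers are hypothetical and are not taken from this study.

```python
def permeate_flux(delta_volume_l, membrane_area_m2, duration_h):
    """Water flux J_w in L m^-2 h^-1 from the volume that crossed the membrane."""
    return delta_volume_l / (membrane_area_m2 * duration_h)


def reverse_solute_flux(c_start_g_per_l, v_start_l, c_end_g_per_l, v_end_l,
                        membrane_area_m2, duration_h):
    """RSF J_s in g m^-2 h^-1 from the draw-solute mass gained by the feed."""
    solute_gain_g = c_end_g_per_l * v_end_l - c_start_g_per_l * v_start_l
    return solute_gain_g / (membrane_area_m2 * duration_h)


# Illustrative numbers only (assumed, not measured in the study):
# 2 L of water permeate through 0.5 m^2 of membrane within 1 h, while the
# draw-solute concentration in the feed rises from 0 to 0.1 g/L.
j_w = permeate_flux(delta_volume_l=2.0, membrane_area_m2=0.5, duration_h=1.0)
j_s = reverse_solute_flux(0.0, 10.0, 0.1, 8.0,
                          membrane_area_m2=0.5, duration_h=1.0)
print(f"J_w = {j_w:.2f} L m^-2 h^-1, J_s = {j_s:.2f} g m^-2 h^-1")
```

Both values follow from measurable quantities alone, which is why they are convenient targets for data-driven models.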
Several experimental studies examined the feasibility of whey concentration by FO (Aydiner et al., 2013; Chen et al., 2019; Menchik and Moraru, 2019; Seker et al., 2017; Wang et al., 2017). Sodium chloride solutions, mostly used as DS in these experiments, led to maximum whey concentrations ranging from 14% to 18% (Aydiner et al., 2013; Chen et al., 2019; Wang et al., 2017), although the authors had estimated that a total dry matter content of 25% to 35% should be attainable (Aydiner et al., 2013). Furthermore, Menchik and Moraru (2019) investigated the concentration of acid whey using potassium lactate as DS, and enriched the whey's dry matter content to about 40 °Brix. Seker et al. (2017) studied the concentration of whey using ammonium carbonate as DS.
The aforementioned studies all concentrated pure whey by FO, whereas this study focuses on the concentration of whey-permeate (i.e. ultrafiltrated whey). Furthermore, the whey-permeate is enriched to a dry matter content exceeding that of previous studies. At this dry matter content, the lactose solubility limit of the solution is exceeded by far. Thus, a complex FS with metastable thermodynamic characteristics is generated.
A number of theoretical transport models have been developed to calculate permeate flux and RSF in FO. McCutcheon et al. and Gray et al. devised detailed mass transfer models considering internal and external concentration polarization and membrane orientation using deionized water and sodium chloride solutions as FS and DS (McCutcheon and Elimelech, 2006, 2007). These models offer a more in-depth understanding of the fundamental mass transfer in FO and give important indications for the direction of future developments in membrane fabrication. Bui et al. (2015) expanded the models of internal and external concentration polarization to calculate the structural parameter of FO membranes, which gives an idea of the mechanical structure of the membranes’ support layers. Qasim et al. (2015) summarized various state-of-the-art flux models for FO applications in desalination processes. Furthermore, various studies examined the behavior of membrane modules (Goda and Sekino, 2020; Phuntsho et al., 2014; You et al., 2012).
These modeling approaches all share a requirement for detailed substance data and thus predominantly focus on synthetic FSs and DSs with well-known characteristics (e.g. H2O + NaCl). For complex systems like supersaturated whey-permeate, a data-driven approach using artificial neural networks (ANN) or multiple linear regression (MLR) could be more suitable for predicting FO performance while avoiding thermodynamic complexities.
Since FO is a more recently developed technology, the scientific discourse about the application of ANN to various membrane processes primarily focuses on RO and microfiltration (Chen and Kim, 2006; Dornier et al., 1995; Ghandehari et al., 2011; Lee et al., 2009; Libotean et al., 2009). Most such studies focus on optimization of process performance in membrane processes, especially in seawater desalination plants. ANNs have been observed to predict water flux as well as the permeate's total dissolved solids (TDS) with very promising results (permeate flux: 0.75 (Lee et al., 2009) ≤ R²test ≤ 0.993 (Ghandehari et al., 2011); TDS: R²test = 0.96 (Lee et al., 2009)). ANNs obtained more accurate results than transport-based system models (e.g. intermediate blocking and standard blocking (Ghandehari et al., 2011)) and than MLR (Chen and Kim, 2006). Chen and Kim (2006) found that a radial basis function neural network (RBFNN) is more accurate than the more commonly used backpropagation neural network (BPNN) for predicting permeate flux decline. The same publication also found any kind of ANN to be clearly superior to more traditional methods of statistical analysis (i.e. MLR). However, few other studies have compared the performance of different kinds of ANNs.
Due to the limited amount of research on FO, the potential of ANNs for process modeling was only examined in a few studies. Pardeshi et al. applied a Taguchi-Neural approach to model the maximum reverse solute flux selectivity (RSFS) of a lab-scale FO module (Pardeshi et al., 2016). The primary objective was to define the parameter levels (temperature, velocity) required for an optimum operation of the setup. For this reason, the temperature and velocity of the FS and DS were the only two parameters considered significant for model development. Aghilesh et al. developed and compared FO models to predict water flux and RSF in textile wastewater treatment using ANN, response surface methodology (RSM) and adaptive neuro-fuzzy inference system (ANFIS) (Aghilesh et al., 2021). Comparing those models, RSM predicted water flux with the greatest accuracy (R² = 0.8529), while ANFIS was most capable of predicting RSF (R² = 0.9427).
A study by Jawad et al. applied a multilayer feedforward neural network (FNN) to 709 data points that were extracted from several studies conducting lab-scale FO experiments (Jawad et al., 2020). Nine different parameters (e.g. FS/DS concentration, FS/DS temperature, membrane type, crossflow velocity, type of DS) were used to predict the permeate flux of a typical lab-scale FO module. The study showed promising results for the ANN (R²test = 0.821), especially in comparison to a model using MLR (R²train = 0.516). Furthermore, Jawad et al. developed an ANN to predict permeate flux in lab-scale FO experiments with NaCl solutions and deionized water (Jawad et al., 2021). The ANN was able to predict water flux precisely (R²test = 0.9978). Subsequently, specific weights of the ANN were analyzed to assess the relative importance of different FO parameters (sensitivity analysis). Finally, RSM was used to determine the optimum operating conditions.
Nevertheless, all aforementioned publications focus on FSs at relatively low concentrations, which are much less complex to handle within FO processing. Besides using a data-driven modeling approach, these processes could also be modeled using common thermodynamic approaches if the physical properties of the FSs and the DSs are known.
In this study, the concentration of whey-permeate is investigated up to concentrations exceeding lactose solubility. Thus, the FS is a metastable system with unknown physical properties, which makes detailed thermodynamic modeling particularly complex. As the literature review shows, the capability of data-driven modeling of the concentration process of a metastable solution within FO has yet to be investigated.
The objective of this study is to examine the capability of data-driven modeling techniques (MLR and ANN) to model permeate flux in a highly complex FO application, based on internally produced experimental data. All models are evaluated via training and test data to compare their suitability for both known and unknown data. First, MLR and ANN are developed using all available inputs. Subsequently, the significance of the input variables is assessed and used to reduce the complexity of MLR and ANN before simplified versions of both models are evaluated and compared to each other. To demonstrate the capabilities of the models, the best-fitting model is applied to examine the influence of temperature and FS inlet flow on the process for systems with different membrane areas. Applied in this manner, data-driven models help with optimizing the process conditions of FO processes. Furthermore, the models can support the monitoring of process performance and may allow for a faster response to performance fluctuations.
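The evaluation scheme described above (fit on training data, then score on both training and test data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the models of this study: the data are synthetic, the 80/20 split, the single hidden layer with five tanh neurons, the learning rate and the number of epochs are arbitrary choices, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's measurements): 156 points, 3 inputs.
n_points, n_inputs = 156, 3
X = rng.uniform(0.0, 1.0, size=(n_points, n_inputs))
y = 5.0 + 4.0 * X[:, 0] - 3.0 * X[:, 1] + 2.0 * X[:, 0] * X[:, 2]
y = y + rng.normal(0.0, 0.1, size=n_points)

# Random split into training and test data (the 80/20 ratio is an assumption).
order = rng.permutation(n_points)
n_train = int(0.8 * n_points)
train, test = order[:n_train], order[n_train:]


def r_squared(y_true, y_pred):
    """Coefficient of determination R^2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


# --- MLR: ordinary least squares with an intercept column ---
def design_matrix(features):
    return np.column_stack([np.ones(len(features)), features])


beta, *_ = np.linalg.lstsq(design_matrix(X[train]), y[train], rcond=None)


def mlr_predict(features):
    return design_matrix(features) @ beta


# --- ANN: one hidden tanh layer, full-batch gradient descent on the MSE ---
x_mean, x_std = X[train].mean(axis=0), X[train].std(axis=0)
y_mean, y_std = y[train].mean(), y[train].std()
n_hidden, learning_rate, n_epochs = 5, 0.02, 3000  # arbitrary choices
W1 = rng.normal(0.0, 0.5, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, size=n_hidden)
b2 = 0.0

Xs = (X[train] - x_mean) / x_std  # standardized training inputs
ys = (y[train] - y_mean) / y_std  # standardized training targets
losses = []
for _ in range(n_epochs):
    hidden = np.tanh(Xs @ W1 + b1)            # forward pass
    error = hidden @ W2 + b2 - ys
    losses.append(np.mean(error ** 2))
    grad_out = 2.0 * error / len(ys)          # d(MSE)/d(prediction)
    grad_hidden = np.outer(grad_out, W2) * (1.0 - hidden ** 2)
    W2 -= learning_rate * (hidden.T @ grad_out)
    b2 -= learning_rate * grad_out.sum()
    W1 -= learning_rate * (Xs.T @ grad_hidden)
    b1 -= learning_rate * grad_hidden.sum(axis=0)


def ann_predict(features):
    hidden = np.tanh(((features - x_mean) / x_std) @ W1 + b1)
    return (hidden @ W2 + b2) * y_std + y_mean


for name, predict in [("MLR", mlr_predict), ("ANN", ann_predict)]:
    r2_train = r_squared(y[train], predict(X[train]))
    r2_test = r_squared(y[test], predict(X[test]))
    print(f"{name}: R2 train = {r2_train:.3f}, R2 test = {r2_test:.3f}")
```

Reporting R² separately for training and test data shows whether a model merely reproduces known points or also generalizes to unseen operating conditions.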
Experimental design
The experimental dataset comprises 156 experimental data points at different operating conditions (Supplementary Material). To evaluate the effect of the process conditions on FO performance, whey-permeate was concentrated by two FO pilot plants, similar in their designs (Figs. A1, A2). The experimental setup was separated into two stages.
The first stage focused on the influence of the basic process parameters, such as temperature, FS concentration, DS concentration, FS inlet flow rate and DS
Calculation
In this study, the calculation part was split into four sequential sub-parts. In a first step, an MLR model was developed considering all available independent variables from the training dataset. In the development of MLR models, all data are typically used to derive the most accurate regression coefficients. However, as this study focuses on comparing the capability of MLR and ANN to predict the permeate flux of an FO process from unknown inputs, a split of testing data is required for both
Results and discussion
Chapter 3 demonstrated the structured model development, validation and simplification. Overall, all models developed in the previous chapter are able to describe the process well (lowest R²All = 0.9587). Table 5 gives an overview of the coefficients of determination of all models considered.
Neither MLR nor ANN shows significantly lower coefficients of determination for the reduced models when compared to the unreduced versions. Hence, the following discussion will focus on the reduced models.
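The comparison of reduced and unreduced models can be illustrated for the MLR case with a short sketch. It assumes synthetic data with one deliberately irrelevant input and uses |t| ≥ 2 as a rough significance threshold. This excerpt does not state which criterion the study applied, so the threshold, the variable names and the data are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (NOT the study's measurements): input_c has no true effect.
n_points = 156
names = ["input_a", "input_b", "input_c"]
X = rng.uniform(0.0, 1.0, size=(n_points, len(names)))
y = 5.0 + 4.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0.0, 0.2, size=n_points)


def fit_ols(features, target):
    """Return coefficients, their t-values and R^2 of an OLS fit with intercept."""
    design = np.column_stack([np.ones(len(features)), features])
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ beta
    dof = len(target) - design.shape[1]
    sigma2 = residuals @ residuals / dof               # residual variance
    cov_beta = sigma2 * np.linalg.inv(design.T @ design)
    t_values = beta / np.sqrt(np.diag(cov_beta))
    ss_tot = np.sum((target - target.mean()) ** 2)
    r2 = 1.0 - residuals @ residuals / ss_tot
    return beta, t_values, r2


# Full model with all inputs.
beta_full, t_full, r2_full = fit_ols(X, y)

# Keep only inputs whose |t| reaches the (assumed) threshold; index 0 of
# t_full belongs to the intercept, hence the offset of one.
T_THRESHOLD = 2.0
keep = [i for i in range(X.shape[1]) if abs(t_full[i + 1]) >= T_THRESHOLD]

# Reduced model, refitted on the retained inputs only.
beta_reduced, t_reduced, r2_reduced = fit_ols(X[:, keep], y)

print("retained inputs:", [names[i] for i in keep])
print(f"R2 full = {r2_full:.4f}, R2 reduced = {r2_reduced:.4f}")
```

Because the reduced model is nested in the full one, its R² on the same data cannot be higher. A negligible drop indicates that the removed inputs carried little information.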
Conclusion
In this study, an ANN as well as an MLR model were developed to model the permeate flux in a highly complex FO process. The process aims at the production of highly supersaturated whey-permeate solutions which can be used as a substrate for lactose production.
Both models were created using all available input variables as a starting point and then systematically simplified by eliminating insignificant process parameters. The study shows that both modeling approaches are capable of modeling permeate flux in the FO
CRediT authorship contribution statement
Lukas Gosmann: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Writing – original draft, Visualization, Project administration. Christian Geitner: Methodology, Software, Formal analysis, Writing – original draft, Visualization. Nora Wieler: Methodology, Software, Validation, Formal analysis, Writing – original draft, Visualization.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (35)
- et al.
Artificial neural network model for predicting drill cuttings settling velocity
Petroleum
(2020)
- et al.
Proper accounting of mass transfer resistances in forward osmosis: improving the accuracy of model predictions of structural parameter
J. Membr. Sci.
(2015)
- et al.
A pilot scale study on the concentration of milk and whey by forward osmosis
Sep. Purif. Technol.
(2019)
- et al.
Prediction of permeate flux decline in crossflow membrane filtration of colloidal suspension: a radial basis function neural network approach
Desalination
(2006)
- et al.
Dynamic modeling of crossflow microfiltration using neural networks
J. Membr. Sci.
(1995)
- et al.
A comparison between semi-theoretical and empirical modeling of cross-flow microfiltration using ANN
Desalination
(2011)
- et al.
Artificial neural network model for optimizing operation of a seawater reverse osmosis desalination plant
Desalination
(2009)
- et al.
Neural network approach for modeling the performance of reverse osmosis membrane desalting
J. Membr. Sci.
(2009)
- et al.
Influence of concentrative and dilutive internal concentration polarization on flux behavior in forward osmosis
J. Membr. Sci.
(2006)
- et al.
Nonthermal concentration of liquid foods by a combination of reverse osmosis and forward osmosis. Acid whey: a case study
J. Food Eng.
(2019)
Determination of optimum conditions in forward osmosis using a combined Taguchi–neural approach
Chem. Eng. Res. Des.
Osmotic equilibrium in the forward osmosis process: modelling, experiments and implications for process performance
J. Membr. Sci.
Water desalination by forward (direct) osmosis phenomenon: a comprehensive review
Desalination
Effect of process parameters on flux for whey concentration with NH3 /CO2 in forward osmosis
Food Bioprod. Process.
Assessment of critical parameters for artificial neural networks based short-term wind generation forecasting
Renew. Energy
Whey recovery using forward osmosis—evaluating the factors limiting the flux performance
J. Membr. Sci.
Temperature as a factor affecting transmembrane water flux in forward osmosis: steady-state modeling and experimental validation
Chem. Eng. J.