A comparative study of multi-objective optimization methodologies for molecular and process design
Introduction
The multiple ways in which the selection of a molecule can impact performance have been well documented through the computer aided molecular design (CAMD) and computer aided molecular and process design (CAMPD) literature. This is reflected in the diversity of objective functions considered: property metrics (e.g. Pretel, López, Bottini, Brignole, 1994, Gani, Jimenez-Gonzalez, ten Kate, Crafts, Jones, Powell, Atherton, Cordiner, 2006, Folić, Adjiman, Pistikopoulos, 2008, Austin, Sahinidis, Konstantinov, Trahan, 2018), economics (e.g. Pereira, Keskes, Galindo, Jackson, Adjiman, 2011, Gopinath, Jackson, Galindo, Adjiman, 2016, Ahmad, Hashim, Mustaffa, Maarof, Yunus, 2018), productivity metrics (e.g. Bardow, Steur, Gross, 2010, Bowskill et al., 2020, Cignitti, Andreasen, Haglind, Woodley, Abildskov, 2017, Eden, Jørgensen, Gani, El-Halwagi, 2004, Zhang, Qin, Peng, Zhou, Cheng, Chen, Qi, 2017), or environmental and safety criteria (e.g. Duvedi, Achenie, et al., 1996, Pistikopoulos, Stefanis, 1998, Hostrup, Harper, Gani, 1999, Song, Song, 2008, Papadopoulos, Stijepovic, Linke, Seferlis, Voutetakis, 2013, Khor, Liam, Loh, Tan, Ng, Hassim, Ng, Chemmangattuvalappil, 2017). An in-depth description of these is beyond the scope of our current paper and the reader is referred to Gopinath et al. (2016) and Ng et al. (2015) and the references therein for further details.
The diversity of objectives used highlights that it is beneficial in many cases to consider multiple conflicting objectives (for example, property targets, sustainability targets, societal impact, and economic performance) that cannot easily be combined into a single metric. Multi-objective optimization (MOO) is thus receiving increasing attention in the area of CAM(P)D. In principle, for a problem with continuous decision variables, the presence of conflicting objectives results in an infinite number of optimal solutions, commonly known as Pareto-frontier solutions. Usually, it is not possible to derive an analytical description of the Pareto frontier (Deb, 2001). Hence, in practical applications, the Pareto frontier is approximated by a finite number of Pareto-optimal solutions (Marler and Arora, 2004).
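The notion of a finite Pareto approximation can be illustrated with a minimal sketch that filters a set of candidate objective vectors down to its non-dominated members. The sketch assumes that all objectives are minimized; the function names and the two-objective candidate values are hypothetical and are not taken from any of the cited studies.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b under minimization:
    a is no worse in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_filter(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (f1, f2), both to be minimized.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0), (6.0, 5.0), (7.0, 1.0)]
front = pareto_filter(candidates)
print(front)
```

Here (3.0, 8.0) is dominated by (2.0, 7.0) and (6.0, 5.0) is dominated by (4.0, 4.0), so four of the six candidates remain as the finite approximation of the front.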
In a nutshell, the main idea in the development of MOO algorithms is to (1) find non-dominated points that can represent the Pareto frontier in reasonable computational time; and (2) generate these points so that they are distributed evenly along the Pareto front. The main MOO methods that have been used in molecular design include scalarization methods (e.g. the weighted sum (WS) method and the sandwich (SD) algorithm), ϵ-constraint methods, and metaheuristic methods. Papadopoulos and Linke (2006) proposed a multi-objective molecular design technique linked with a process synthesis framework by using the weighted sum method, extending it to the design of binary working fluid mixtures in Organic Rankine Cycles (ORC) (Papadopoulos et al., 2010) and to the design of solvents for CO2 capture (Papadopoulos, Badr, Chremos, Forte, Zarogiannis, Seferlis, Papadokonstantakis, Galindo, Jackson, Adjiman, 2016, Papadopoulos et al., 2020). The authors adopted the simulated annealing (SA) algorithm proposed by Marcoulaki and Kokossis (2000) to explore the design space directly. Burger et al. (2015) utilized the SD (Bortz et al., 2014) within their mixed-integer nonlinear programming (MINLP) solution strategy to design a solvent for a CO2 physical absorption process, avoiding the difficulty of assigning weight vectors. The solutions generated by MOO were used as starting points for the solution of the CAMPD MINLP. In this last step, a single (economic) objective was used. Zhou et al. (2019) also applied the SD to identify a list of solvent candidates in their MOO CAMD formulation, which were further optimized using rigorous thermodynamic analysis. In their CAMD problem, the selectivity and capacity of the solvents were optimized simultaneously to account for their efficiency within an extractive distillation process. Buxton et al. (1999) and Hugo et al. (2004) considered both process economics and environmental impact simultaneously in the formulation of a CAMPD MINLP and proposed the use of the ϵ-constraint method (Haimes et al., 1971) for its solution. The environmental impact metrics were treated as constraints in their formulation, while the economic performance was set as the objective function. Kim and Diwekar (2002) proposed a novel MOO framework based on the ϵ-constraint method to solve the integrated design of the solvent recycling process and environmentally benign solvents for acetic acid removal from water. In their study, a four-objective problem was transformed into single-objective optimization (SOO) problems, and the solutions were obtained using the Hammersley stochastic annealing algorithm, with which the SOO problems are sampled efficiently. Schilling et al. (2017) solved an integrated working fluid and ORC process design problem with the ϵ-constraint method to identify the trade-off between net power output and total capital investment. The CAMPD problem was formulated as a MINLP and a one-stage continuous-molecular targeting (CoMT) approach was applied. Ng et al. (2014) introduced a metaheuristic method, fuzzy optimization (Liang, 2008), in developing a MOO CAMPD approach to the design of optimal chemical products, by considering both the optimality of product properties and the accuracy of the property prediction models. Venkatasubramanian et al. (1994) employed a genetic algorithm (GA) in the design of optimal polymer and refrigerant structures. They introduced a string representation of the molecular structures as an encoding strategy and used molecular genetic operators (single-point crossover, chain-mutation, insertion, deletion, and blending) to solve SOO MINLP problems. Dörgő and Abonyi (2016, 2019) extended the GA approaches to the design of refrigerants for ORC processes. The authors solved the MOO MINLP problems using the non-dominated sorting genetic algorithm II (NSGA-II) (Deb, 2001) to investigate trade-offs between several properties of molecules. Xu and Diwekar (2007) developed a multi-objective efficient genetic algorithm (MOEGA) for the integrated design of solvents and a solvent recycling process for mixtures of acetic acid and water. The authors considered up to six objectives, including acetic acid recovery, process flexibility, two environmental impacts based on LC50, an environmental factor based on the bioconcentration factor, and the energy consumption of the reboiler.
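The ranking idea that underlies non-dominated sorting methods such as NSGA-II can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published algorithm: it peels off successive non-dominated fronts by repeated pairwise comparison, omits the bookkeeping of the fast sorting procedure as well as crowding distance, selection, crossover and mutation, and uses hypothetical objective values with all objectives minimized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a dominates b when every objective is minimized."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(objs: List[Sequence[float]]) -> List[List[int]]:
    """Group solution indices into ranked fronts.

    Front 0 holds the non-dominated solutions; front 1 holds the solutions
    that become non-dominated once front 0 is removed, and so on.
    """
    remaining = set(range(len(objs)))
    fronts: List[List[int]] = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical population of six solutions with two objectives each.
population = [(1, 5), (2, 3), (4, 1), (2, 6), (3, 4), (5, 5)]
fronts = non_dominated_sort(population)
print(fronts)
```

In a genetic algorithm, the front index would serve as the primary fitness rank, so that solutions in earlier fronts are preferred when the next generation is selected.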
While several MOO methods have been applied to CAM(P)D, CAM(P)D presents challenges due to the nonconvexity of the search space that arises from the continuous functions and the presence of integer variables. The Pareto front can be discontinuous and nonconvex, so that the efficient identification of a well-distributed set of points on or near the Pareto front is non-trivial. In particular, WS approaches are highly dependent on the choice of weight vectors. An even distribution of the weights among objective functions does not always lead to an even distribution of solutions on the Pareto front (Das and Dennis, 1997). Therefore, the use of the WS is often time-consuming, as a large number of SOO problems needs to be solved. Although this drawback has been addressed in the SD algorithm, in which the weight vectors are selected systematically, the WS and SD approaches can only identify convex regions of the Pareto front (Marler and Arora, 2004). This is in fact a limitation of all weighted-sum-based methods (Bortz et al., 2014). This limitation can be overcome with ϵ-constraint methods, but it is challenging to choose appropriate values of ϵ, especially when the problem entails more than two objectives (Mavrotas, 2009). Moreover, the constraints added to the original problem increase the complexity of solving each SOO problem. Furthermore, it may be difficult to find even one point on the Pareto front, and convergence to dominated solutions is common. In scalarization and ϵ-constraint methods, this is apparent when SOO problems are solved repeatedly: the mixed-integer and nonconvex nature of most molecular design problems means that the solver may converge to a dominated solution rather than a Pareto point. In fuzzy methods and genetic algorithms, a set of solutions that is close to the Pareto front may be identified, but there can be no guarantee that the points found are Pareto-optimal (Deb, 2001). The application of these methods can result in a large number of unsuccessful computations due to the highly constrained nature of such MINLPs. Despite the many challenges faced in applying MOO to MINLPs, there has been no systematic analysis to compare the performance of various algorithms for CAM(P)D applications.
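The difference between weighted-sum and ϵ-constraint scalarization on a nonconvex front can be seen in a small sketch. It assumes a hypothetical problem with only three feasible designs and two objectives to be minimized, so that each SOO problem can be solved by enumeration; the point values, weight grid and ϵ values are chosen purely for illustration.

```python
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float]

# Three mutually non-dominated designs (f1, f2). B lies above the line
# joining A and C, i.e. in a nonconvex region of the front.
A, B, C = (0.0, 4.0), (3.0, 3.0), (4.0, 0.0)
designs: List[Point] = [A, B, C]


def weighted_sum_solutions(points: List[Point], n_weights: int = 11) -> Set[Point]:
    """Solve min w*f1 + (1-w)*f2 by enumeration for evenly spaced weights."""
    found: Set[Point] = set()
    for k in range(n_weights):
        w = k / (n_weights - 1)
        found.add(min(points, key=lambda p: w * p[0] + (1 - w) * p[1]))
    return found


def epsilon_constraint_solutions(points: List[Point], eps_values: Iterable[float]) -> Set[Point]:
    """Solve min f1 subject to f2 <= eps by enumeration for each eps value."""
    found: Set[Point] = set()
    for eps in eps_values:
        feasible = [p for p in points if p[1] <= eps]
        if feasible:
            # Ties in f1 are broken on f2 to avoid weakly dominated points.
            found.add(min(feasible, key=lambda p: (p[0], p[1])))
    return found


ws_found = weighted_sum_solutions(designs)
ec_found = epsilon_constraint_solutions(designs, [0.0, 1.0, 2.0, 3.0, 4.0])
print(sorted(ws_found), sorted(ec_found))
```

For B to minimize the weighted sum, the weight w on f1 would need to satisfy both w ≤ 0.25 and w ≥ 0.75, which is impossible, so no choice of weights recovers it. The ϵ-constraint formulation finds B once ϵ = 3 admits it, at the cost of having to choose the ϵ grid.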
In this work, we present a comparative analysis of the performance of three classes of MINLP MOO approaches: the WS, the SD (Rennen et al., 2011), and NSGA-II (Deb et al., 2000). The solution of each scalarized problem with WS and SD can be challenging in the presence of nonlinearities. To increase the likelihood of identifying the globally-optimal Pareto front, we introduce SA (Marcoulaki and Kokossis, 2000) and multi-level single linkage (MLSL) (Kucherenko and Sytsko, 2005) as global search methods. We first make use of a SA version of the WS (WSSA) and a MLSL version of the WS (WSML). These two algorithms are compared with the WS method to investigate the effectiveness of global search methods. Based on the findings, we also put forward two variants of the algorithms, MLSL with the WS (WSML) and MLSL with the SD (SDML), and a comparison between WSML, SDML and NSGA-II is undertaken.
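The idea of pairing a weighted-sum scalarization with a stochastic global search can be sketched with a generic simulated annealing loop. This is an assumption-laden illustration and not the WSSA algorithm of this work nor the method of Marcoulaki and Kokossis (2000): the two quadratic objectives, the integer bounds, the neighbourhood move and the geometric cooling schedule are all hypothetical choices.

```python
import math
import random
from typing import Sequence, Tuple


def scalarized(y: Sequence[int], w: float) -> float:
    """Weighted sum of two hypothetical objectives over integer variables."""
    f1 = (y[0] - 1) ** 2 + (y[1] - 1) ** 2
    f2 = (y[0] - 4) ** 2 + (y[1] - 3) ** 2
    return w * f1 + (1 - w) * f2


def anneal(w: float, y0: Sequence[int], t0: float = 5.0, cooling: float = 0.95,
           steps: int = 500, seed: int = 0) -> Tuple[int, ...]:
    """Minimize the scalarized objective over integers in [0, 5] by SA."""
    rng = random.Random(seed)
    current = tuple(y0)
    best = current
    temp = t0
    for _ in range(steps):
        # Neighbour move: change one randomly chosen variable by +/-1,
        # clipped to the bounds.
        i = rng.randrange(len(current))
        cand = list(current)
        cand[i] = min(5, max(0, cand[i] + rng.choice((-1, 1))))
        cand_t = tuple(cand)
        delta = scalarized(cand_t, w) - scalarized(current, w)
        # Metropolis criterion: always accept improvements, and accept
        # worse moves with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand_t
            if scalarized(current, w) < scalarized(best, w):
                best = current
        temp = max(temp * cooling, 1e-3)  # geometric cooling with a floor
    return best


start = (0, 0)
best = anneal(w=0.5, y0=start)
print(best, scalarized(best, 0.5))
```

Repeating the call for a range of weights w would give one candidate Pareto point per scalarized problem; because the search is stochastic, the returned point is not guaranteed to be the global optimum of that problem.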
The resulting set of algorithms is applied to three case studies: the design of solvents for the chemical absorption of CO2, the design of working fluids for ORCs based on property metrics, and the integrated design of working fluids and ORC processes. The performance of the different algorithms is compared based on reliability and efficiency criteria.
The main contributions of this work include: 1) the introduction of different global search methods to solve single-objective MINLPs within MOO scalarization methods; 2) a systematic comparative study of the performance of different classes of MOO algorithms in solving several literature MOO CAM(P)D problems. In addition, several modifications of the SA, MLSL and NSGA-II approaches are introduced to adapt these algorithms to CAM(P)D problems. The remainder of this article is structured as follows. In Section 2, we describe the optimization algorithms. We present in Section 3 the performance metrics used to compare the algorithms. We formulate the case studies in Section 4. Subsequently, the results of the MOO algorithms when applied to the case studies are reported in Section 5, where the performance of each algorithm is discussed in detail. Finally, we provide the main conclusions in Section 6.
Problem formulation
The molecular design problem is often posed as a MINLP problem. The generic mathematical formulation of the MINLP MOO problem is: where p is the number of objectives, x is an n-dimensional vector of continuous variables, y is a q-dimensional vector of binary or integer variables, n is a q-dimensional vector of integer variables, g(x, y, n) is a vector of inequality constraints that define design constraints and
Quality measures for multi-objective optimization
Since no single metric can represent the performance of the algorithms, a series of appropriate metrics is used to assess performance in the specific domain of molecular design. In particular, in developing the metrics, we take into account the fact that some of the case studies involve only discrete decision variables, in which case a full enumeration of the solution is possible and provides further insights. Other case studies include both discrete and continuous variables and such an
Case studies
The optimization methodologies presented are applied to three case studies to assess their performance and to examine the applicability of each method. The case studies are selected so that different levels of complexity are explored in terms of problem size and numerical difficulty.
Results and discussion
In this section, we compare the relative performance of all algorithms. Table 8 summarizes the problem size of the model defined in each case study. The feasibility tests are not considered in the problem size of case study 3. All MOO methods are implemented in Matlab 2018a with common subfunctions, and all runs are performed on a single Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz processor with 384 GB of RAM. For the local solution of MINLPs,
Conclusions
In this paper, we have compared several MOO algorithms by assessing their performance on molecular design applications. The algorithms include two types of scalarization-based methods and one evolutionary algorithm. In order to avoid premature convergence to a suboptimal front, two global search algorithms were combined with one of the scalarization methods (weighted sum) and tested for reliability and efficiency. Two CAMD case studies and one CAMPD case study, each with a different size of
CRediT authorship contribution statement
Ye Seol Lee: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization. Edward J. Graham: Software, Data curation, Investigation. Amparo Galindo: Writing - review & editing, Visualization, Supervision. George Jackson: Writing - review & editing, Supervision. Claire S. Adjiman: Conceptualization, Visualization, Writing - review & editing, Supervision, Project administration, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors gratefully acknowledge financial support from the Engineering and Physical Sciences Research Council (EPSRC) of the UK (grant EP/J003840), the European Union's Horizon 2020 research and innovation program under the grant agreement 727503 - ROLINCAP, and the Centre for Process Systems Engineering Research Committee of Imperial College London via a Roger Sargent scholarship.
Data statement: Data underlying this article can be accessed on Zenodo at doi.org/10.5281/zenodo.3568383, and used under
References (67)
- et al. Design of energy efficient reactive solvents for post combustion CO2 capture using computer aided approach. J. Cleaner Prod. (2018)
- et al. Multi-criteria optimization in chemical process design and decision support by navigation on Pareto sets. Comput. Chem. Eng. (2014)
- et al. Integrated working fluid-thermodynamic cycle design of organic Rankine cycle power systems for waste heat recovery. Appl. Energy (2017)
- et al. Designing environmentally safe refrigerants using mathematical programming. Chem. Eng. Sci. (1996)
- et al. A novel framework for simultaneous separation process and product design. Chem. Eng. Process. (2004)
- et al. Design of environmentally benign processes: integration of solvent design and separation process synthesis. Comput. Chem. Eng. (1999)
- et al. Viscosity estimation at low temperatures (Tr < 0.75) for organic liquids from group contributions. Chem. Eng. J. (2002)
- et al. Group-contribution + (GC+) based estimation of properties of pure components: Improved property estimation and uncertainty analysis. Fluid Phase Equilib. (2012)
- et al. Computer aided molecular design for alternative sustainable solvent to extract oil from palm pressed fibre. Process Saf. Environ. Prot. (2017)
- Fuzzy multi-objective production/distribution planning decisions with multi-product and multi-time period in a supply chain. Comput. Ind. Eng. (2008)