A comparative study of multi-objective optimization methodologies for molecular and process design

https://doi.org/10.1016/j.compchemeng.2020.106802

Abstract

The need to consider multiple objectives in molecular design, whether based on techno-economic, environmental or health and safety metrics, is increasingly recognized. There is, however, limited understanding of the suitability of different multi-objective optimization (MOO) algorithms for the solution of such design problems. In this work, we present a systematic comparison of the performance of five mixed-integer nonlinear programming (MINLP) MOO algorithms on a selection of computer-aided molecular design (CAMD) and computer-aided molecular and process design (CAMPD) problems. The five methods are designed to address the discrete and nonlinear nature of the problem, with the aim of generating an accurate approximation of the Pareto front. They include: a weighted sum approach without global search phases (SWS), a weighted sum approach with simulated annealing (WSSA), a weighted sum approach with multi-level single linkage, MLSL (WSML), the sandwich algorithm with MLSL (SDML) and the non-dominated sorting genetic algorithm II (NSGA-II). The algorithms are compared systematically in two steps. First, the effectiveness of the global search methods is evaluated with SWS, WSSA and WSML. WSML is found to be the most effective, and a comparative analysis of WSML, SDML and NSGA-II is then undertaken. As a test set for these optimization techniques, two CAMD problems and one CAMPD problem of varying dimensionality are formulated as case studies. The results show that SDML provides the most efficient generation of a diverse set of Pareto points, leading to the construction of an approximate Pareto front close to the exact Pareto front.

Introduction

The multiple ways in which the selection of a molecule can impact performance have been well documented in the computer-aided molecular design (CAMD) and computer-aided molecular and process design (CAMPD) literature. This is reflected in the diversity of objective functions considered: property metrics (e.g. Pretel, López, Bottini, Brignole, 1994, Gani, Jimenez-Gonzalez, ten Kate, Crafts, Jones, Powell, Atherton, Cordiner, 2006, Folić, Adjiman, Pistikopoulos, 2008, Austin, Sahinidis, Konstantinov, Trahan, 2018), economics (e.g. Pereira, Keskes, Galindo, Jackson, Adjiman, 2011, Gopinath, Jackson, Galindo, Adjiman, 2016, Ahmad, Hashim, Mustaffa, Maarof, Yunus, 2018), productivity metrics (e.g. Bardow, Steur, Gross, 2010, Bowskill et al., 2020, Cignitti, Andreasen, Haglind, Woodley, Abildskov, 2017, Eden, Jørgensen, Gani, El-Halwagi, 2004, Zhang, Qin, Peng, Zhou, Cheng, Chen, Qi, 2017), or environmental and safety criteria (e.g. Duvedi, Achenie, et al., 1996, Pistikopoulos, Stefanis, 1998, Hostrup, Harper, Gani, 1999, Song, Song, 2008, Papadopoulos, Stijepovic, Linke, Seferlis, Voutetakis, 2013, Khor, Liam, Loh, Tan, Ng, Hassim, Ng, Chemmangattuvalappil, 2017). An in-depth description of these is beyond the scope of the current paper and the reader is referred to Gopinath et al. (2016) and Ng et al. (2015) and the references therein for further details.

The diversity of objectives used highlights that it is beneficial in many cases to consider multiple conflicting objectives (for example, property targets, sustainability targets, societal impact, and economic performance) that cannot easily be combined in a single metric. Multi-objective optimization (MOO) is thus receiving increasing attention in the area of CAM(P)D. In principle, for a problem with continuous decision variables, the presence of conflicting objectives results in an infinite number of optimal solutions, commonly known as Pareto-frontier solutions. Usually, it is not possible to derive an analytical description of the Pareto frontier (Deb, 2001). Hence, in practical applications, the Pareto frontier is approximated by a finite number of Pareto-optimal solutions (Marler and Arora, 2004).
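To make the notion of a Pareto-optimal (non-dominated) solution concrete, the following minimal sketch filters a finite set of candidate designs down to its non-dominated subset. It assumes that all objectives are minimized; the objective values in `candidates` are hypothetical and are not taken from the case studies.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b (minimization): a is no worse than b in
    every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (f1, f2) values for seven candidate molecules.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0),
              (6.0, 5.0), (7.0, 2.0), (9.0, 1.0)]
front = non_dominated(candidates)
print(front)
```

Here (3.0, 8.0) is dominated by (2.0, 7.0) and (6.0, 5.0) by (4.0, 4.0), so five points remain. The pairwise check scales quadratically with the number of candidates, which is acceptable for a sketch but not for large enumerations.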

In short, the aim in developing MOO algorithms is to (1) find non-dominated points that can represent the Pareto frontier in reasonable computational time; and (2) generate these points so that they are distributed evenly along the Pareto front. The main MOO methods that have been used in molecular design include scalarization methods (e.g. the weighted sum (WS) method and the sandwich (SD) algorithm), ϵ-constraint methods, and metaheuristic methods.

Among scalarization methods, Papadopoulos and Linke (2006) proposed a multi-objective molecular design technique linked with a process synthesis framework by using the weighted sum method, extending it to the design of binary working fluid mixtures in Organic Rankine Cycles (ORC) (Papadopoulos et al., 2010) and to the design of solvents for CO2 capture (Papadopoulos, Badr, Chremos, Forte, Zarogiannis, Seferlis, Papadokonstantakis, Galindo, Jackson, Adjiman, 2016, Papadopoulos et al., 2020). The authors adopted the simulated annealing (SA) algorithm proposed by Marcoulaki and Kokossis (2000) to explore the design space directly. Burger et al. (2015) used the SD algorithm (Bortz et al., 2014) within their mixed-integer nonlinear programming (MINLP) solution strategy to design a solvent for a CO2 physical absorption process, avoiding the difficulty of assigning weight vectors. The solutions generated by MOO were used as starting points for the solution of the CAMPD MINLP. In this last step, a single (economic) objective was used. Zhou et al. (2019) also applied the SD algorithm to identify a list of solvent candidates in their MOO CAMD formulation, which were further optimized using rigorous thermodynamic analysis. In their CAMD problem, the selectivity and capacity of the solvents were optimized simultaneously to account for their efficiency within an extractive distillation process.

Buxton et al. (1999) and Hugo et al. (2004) considered both process economics and environmental impact simultaneously in the formulation of a CAMPD MINLP and proposed the use of the ϵ-constraint method (Haimes et al., 1971) for its solution. The environmental impact metrics were treated as constraints in their formulation, while the economic performance was set as the objective function. Kim and Diwekar (2002) proposed a novel MOO framework based on the ϵ-constraint method to solve the integrated design of the solvent recycling process and environmentally benign solvents for acetic acid removal from water. In their study, a four-objective problem was transformed into single-objective optimization (SOO) problems and the solutions were obtained using the Hammersley stochastic annealing algorithm, with which the SOO problems are sampled efficiently. Schilling et al. (2017) solved an integrated working fluid and ORC process design problem with the ϵ-constraint method to identify the trade-off between net power output and total capital investment. The CAMPD problem was formulated as an MINLP and a one-stage continuous-molecular targeting (CoMT) approach was applied.

Ng et al. (2014) introduced a metaheuristic method, fuzzy optimization (Liang, 2008), in developing a MOO CAMPD approach to the design of optimal chemical products, by considering both the optimality of product properties and the accuracy of the property prediction models. Venkatasubramanian et al. (1994) employed a genetic algorithm (GA) in the design of optimal polymer and refrigerant structures. They introduced a string representation of the molecular structures as an encoding strategy and applied molecular genetic operators (single-point crossover, chain-mutation, insertion, deletion, and blending) to SOO MINLP problems. Dörgő and Abonyi (2016, 2019) extended the GA approaches to the design of refrigerants for ORC processes. The authors solved the MOO MINLP problems using the non-dominated sorting genetic algorithm II (NSGA-II) (Deb, 2001) to investigate trade-offs between several properties of molecules. Xu and Diwekar (2007) developed a multi-objective efficient genetic algorithm (MOEGA) for the integrated design of solvents and the solvent recycling process for mixtures of acetic acid and water. The authors considered up to six objectives, including acetic acid recovery, process flexibility, two environmental impacts based on LC50, an environmental factor based on the bioconcentration factor, and the energy consumption of the reboiler.
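The ϵ-constraint idea described above, in which one objective is minimized while the other is bounded by ϵ, can be sketched on a small enumerated design set. This is an illustrative sketch only, not the formulation of any of the cited studies: the candidate values, the names `cost` and `impact`, and the grid of ϵ values are all assumptions, and each single-objective problem is solved by plain enumeration rather than by an MINLP solver.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (cost, impact), both to be minimized


def epsilon_constraint(points: Sequence[Point],
                       eps_values: Sequence[float]) -> List[Point]:
    """For each eps, minimize cost subject to impact <= eps.

    Ties in cost are broken by the lower impact, so every returned point
    is Pareto-optimal within the enumerated set.
    """
    found: List[Point] = []
    for eps in eps_values:
        feasible = [p for p in points if p[1] <= eps]
        if not feasible:
            continue  # this eps makes the single-objective problem infeasible
        best = min(feasible, key=lambda p: (p[0], p[1]))
        if best not in found:
            found.append(best)
    return found


# Hypothetical (cost, impact) values for seven candidate designs.
candidates = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0),
              (6.0, 5.0), (7.0, 2.0), (9.0, 1.0)]
eps_grid = [9.0, 7.0, 5.0, 3.0, 1.0, 0.5]
pareto_points = epsilon_constraint(candidates, eps_grid)
print(pareto_points)
```

The last value, ϵ = 0.5, leaves no feasible candidate and is skipped, which illustrates why the choice of ϵ values matters: poorly chosen bounds produce infeasible or repeated subproblems.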

While several MOO methods have been applied to CAM(P)D, these problems present challenges due to the nonconvexity of the search space, which arises from the continuous functions and the presence of integer variables. The Pareto front can be discontinuous and nonconvex, so that the efficient identification of a well-distributed set of points on or near the Pareto front is non-trivial. In particular, WS approaches are highly dependent on the choice of weight vectors. An even distribution of the weights among objective functions does not always lead to an even distribution of solutions on the Pareto front (Das and Dennis, 1997). Therefore, the use of the WS is often time-consuming, as a large number of SOO problems need to be solved. Although this drawback has been addressed in the SD algorithm, in which the weight vectors are selected systematically, the WS and SD approaches can only identify convex regions of the Pareto front (Marler and Arora, 2004). This is in fact a limitation of all weighted-sum-based methods (Bortz et al., 2014). This limitation can be overcome with ϵ-constraint methods, but it is challenging to choose appropriate values of ϵ, especially when the problem entails more than two objectives (Mavrotas, 2009). Moreover, the constraints added to the original problem increase the complexity of solving each SOO problem. Furthermore, it may be difficult to find even one point on the Pareto front, and convergence to dominated solutions is common. In scalarization and ϵ-constraint methods, this is apparent when SOO problems are solved repeatedly: the mixed-integer and nonconvex nature of most molecular design problems means that the solver may converge to a dominated solution rather than a Pareto point. In fuzzy methods and genetic algorithms, a set of solutions that is close to the Pareto front may be identified, but there can be no guarantee that the points found are Pareto-optimal (Deb, 2001). The application of these methods can result in a large number of unsuccessful computations due to the highly constrained nature of such MINLPs. Despite the many challenges faced in applying MOO to MINLPs, there has been no systematic analysis to compare the performance of various algorithms for CAM(P)D applications.
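The restriction of weighted-sum methods to convex regions of the Pareto front can be seen on a four-point toy example. In this sketch the point (6, 3) is Pareto-optimal but lies above the line joining its neighbours (2, 4) and (10, 0), so no weight vector selects it. The points and the grid of 21 weights are assumptions chosen for illustration, and each scalarized problem is solved by enumeration.

```python
from typing import List, Set, Tuple

Point = Tuple[float, float]  # (f1, f2), both to be minimized


def pareto_set(points: List[Point]) -> Set[Point]:
    """Non-dominated subset of a finite list of bi-objective points."""
    def dominated(p: Point) -> bool:
        return any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
    return {p for p in points if not dominated(p)}


def weighted_sum_points(points: List[Point], n_weights: int = 21) -> Set[Point]:
    """Minimize w*f1 + (1-w)*f2 over the points for w on a uniform grid."""
    found: Set[Point] = set()
    for i in range(n_weights):
        w = i / (n_weights - 1)
        found.add(min(points, key=lambda p: w * p[0] + (1.0 - w) * p[1]))
    return found


# Hypothetical objective values; all four points are Pareto-optimal.
points = [(0.0, 10.0), (2.0, 4.0), (6.0, 3.0), (10.0, 0.0)]
front = pareto_set(points)
supported = weighted_sum_points(points)
missed = front - supported
print(sorted(missed))
```

For (6, 3) to win, its scalarized value 3 + 3w would have to be no larger than both 4 - 2w and 10w, which requires w ≤ 0.2 and w ≥ 3/7 at the same time. Refining the weight grid therefore cannot recover this point.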

In this work, we present a comparative analysis of the performance of three classes of MINLP MOO approaches: the WS, the SD (Rennen et al., 2011), and the NSGA-II (Deb et al., 2000). The solution of each scalarized problem with the WS and SD can be challenging in the presence of nonlinearities. To increase the likelihood of identifying the globally optimal Pareto front, we introduce SA (Marcoulaki and Kokossis, 2000) and multi-level single linkage (MLSL) (Kucherenko and Sytsko, 2005) as global search methods. We first make use of an SA version of the WS (WSSA) and an MLSL version of the WS (WSML). These two algorithms are compared with the WS method without a global search phase to investigate the effectiveness of the global search methods. Based on the findings of this comparison, we put forward two MLSL-based variants, one built on the WS (WSML) and one on the SD (SDML), and a comparison between WSML, SDML and NSGA-II is undertaken.
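The role of a global search phase inside a weighted-sum scalarization can be illustrated with a generic simulated annealing loop over a small integer design vector. This is a textbook-style sketch with geometric cooling and single-variable moves, not the variant of Marcoulaki and Kokossis (2000) nor the modified algorithm used in this work. The toy objectives `f1` and `f2`, the bounds, the weight and all parameter values are assumptions.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def f1(n: Sequence[int]) -> float:
    # Hypothetical nonconvex objective of three integer variables.
    return (n[0] - 4) ** 2 + (n[1] - 1) ** 2 + 3.0 * math.cos(2.0 * n[2])


def f2(n: Sequence[int]) -> float:
    # Hypothetical second objective: total number of groups.
    return float(sum(n))


def scalarized(n: Sequence[int], w: float = 0.7) -> float:
    """Weighted-sum scalarization of the two toy objectives."""
    return w * f1(n) + (1.0 - w) * f2(n)


def simulated_annealing(objective: Callable[[Sequence[int]], float],
                        start: Sequence[int], lower: int, upper: int,
                        t0: float = 5.0, cooling: float = 0.95,
                        steps: int = 500, seed: int = 0
                        ) -> Tuple[List[int], float]:
    """Minimize objective over integer vectors within [lower, upper]."""
    rng = random.Random(seed)
    current, current_val = list(start), objective(start)
    best, best_val = list(current), current_val
    temp = t0
    for _ in range(steps):
        # Neighbour move: shift one randomly chosen variable by +1 or -1.
        cand = list(current)
        i = rng.randrange(len(cand))
        cand[i] = min(upper, max(lower, cand[i] + rng.choice((-1, 1))))
        cand_val = objective(cand)
        delta = cand_val - current_val
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_val = cand, cand_val
            if current_val < best_val:
                best, best_val = list(current), current_val
        temp *= cooling  # geometric cooling schedule
    return best, best_val


start = [0, 5, 0]
best, best_val = simulated_annealing(scalarized, start, lower=0, upper=5)
print(best, best_val)
```

Accepting occasional uphill moves at high temperature is what lets the search leave a local minimum of the scalarized problem; repeating the run for several weight values would yield one candidate Pareto point per weight.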

The resulting set of algorithms is applied to three case studies: the design of solvents for the chemical absorption of CO2, the design of working fluids for ORCs based on property metrics, and the integrated design of working fluids and ORC processes. The performance of the different algorithms is compared based on reliability and efficiency criteria.

The main contributions of this work include: 1) the introduction of different global search methods to solve single objective MINLPs within MOO scalarization methods; 2) a systematic comparative study of the performance of different classes of MOO algorithms on solving several literature MOO CAM(P)D problems. In addition, several modifications of the SA, MLSL and NSGA-II approaches are introduced to adapt these algorithms to CAM(P)D problems. The remainder of this article is structured as follows. In Section 2, we describe the optimization algorithms. We present in Section 3 the performance metrics used to compare the algorithms. We formulate the case studies in Section 4. Subsequently, the results of the MOO algorithms when applied to the case studies are reported in Section 5, where the performance of each algorithm is discussed in detail. Finally, we provide the main conclusions in Section 6.

Section snippets

Problem formulation

The molecular design problem is often posed as an MINLP problem. The generic mathematical formulation of the MINLP MOO problem is:

\[
\begin{aligned}
\min_{x,\,y,\,n} \quad & f_1^{o}(x,y,n),\; f_2^{o}(x,y,n),\; \ldots,\; f_p^{o}(x,y,n) \\
\text{s.t.} \quad & g(x,y,n) \le 0, \\
& h(x,y,n) = 0, \\
& x \in \mathbb{R}^{n},\quad y \in \{0,1\}^{q},\quad n \in N \subset \mathbb{Z}^{q},
\end{aligned}
\]

where p is the number of objectives, x is an n-dimensional vector of continuous variables, y is a q-dimensional vector of binary or integer variables, n is a q-dimensional vector of integer variables, g(x, y, n) is a vector of inequality constraints that define design constraints and

Quality measures for multi-objective optimization

Since no single metric can represent the performance of the algorithms, a series of appropriate metrics is used to assess performance in the specific domain of molecular design. In particular, in developing the metrics, we take into account the fact that some of the case studies involve only discrete decision variables, in which case a full enumeration of the solution is possible and provides further insights. Other case studies include both discrete and continuous variables and such an

Case studies

The optimization methodologies presented are applied to three case studies to assess their performance and to examine the applicability of each method. The case studies are selected so that different levels of complexity are explored in terms of problem size and numerical difficulty.

Results and discussion

In this section, we compare the relative performance of all algorithms. Table 8 summarizes the comparison of the problem size for the model defined in each case study. The feasibility tests are not considered in the problem size of case study 3. All MOO methods are implemented with common subfunctions using the same programming language in Matlab 2018a and all runs are performed on a single Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz processor with 384 GB of RAM. For the local solution of MINLPs,

Conclusions

In this paper, we have compared several MOO algorithms by assessing their performance on molecular design applications. The algorithms include two types of scalarization-based methods and one evolutionary algorithm. In order to avoid premature convergence to a suboptimal front, two global search algorithms were combined with one of the scalarization methods (weighted sum) and tested for reliability and efficiency. Two CAMD case studies and one CAMPD case study, each with a different size of

CRediT authorship contribution statement

Ye Seol Lee: Conceptualization, Methodology, Software, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization. Edward J. Graham: Software, Data curation, Investigation. Amparo Galindo: Writing - review & editing, Visualization, Supervision. George Jackson: Writing - review & editing, Supervision. Claire S. Adjiman: Conceptualization, Visualization, Writing - review & editing, Supervision, Project administration, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The authors gratefully acknowledge financial support from the Engineering and Physical Sciences Research Council (EPSRC) of the UK (grant EP/J003840), the European Union's Horizon 2020 research and innovation program under grant agreement 727503 - ROLINCAP, and the Centre for Process Systems Engineering Research Committee of Imperial College London via a Roger Sargent scholarship.

Data statement: Data underlying this article can be accessed on Zenodo at doi.org/10.5281/zenodo.3568383, and used under

References (67)

  • E. Marcoulaki et al. On the development of novel chemicals using a systematic optimisation approach. Part II. Solvent design. Chem. Eng. Sci. (2000)
  • G. Mavrotas. Effective implementation of the ε-constraint method in multi-objective mathematical programming problems. Appl. Math. Comput. (2009)
  • L.Y. Ng et al. Challenges and opportunities in computer-aided molecular design. Comput. Chem. Eng. (2015)
  • O. Odele et al. Computer aided molecular design: a novel method for optimal solvent selection. Fluid Phase Equilib. (1993)
  • O. Palma-Flores et al. Optimal molecular design of working fluids for sustainable low-temperature energy recovery. Comput. Chem. Eng. (2015)
  • A.I. Papadopoulos et al. On the systematic design and selection of optimal working fluids for organic Rankine cycles. Appl. Therm. Eng. (2010)
  • F.E. Pereira et al. Integrated solvent and process design using a SAFT-VR thermodynamic description: high-pressure separation of carbon dioxide and methane. Comput. Chem. Eng. (2011)
  • E. Pistikopoulos et al. Optimal solvent design for environmental impact minimization. Comput. Chem. Eng. (1998)
  • N.V. Sahinidis et al. Applications of global optimization to process and molecular design. Comput. Chem. Eng. (2000)
  • S. Sastri et al. A new temperature–thermal conductivity relationship for predicting saturated liquid thermal conductivity. Chem. Eng. J. (1999)
  • R.S. Solanki et al. Approximating the noninferior set in multiobjective linear programming problems. Eur. J. Oper. Res. (1993)
  • V. Venkatasubramanian et al. Computer-aided molecular design using genetic algorithms. Comput. Chem. Eng. (1994)
  • J. Viswanathan et al. A combined penalty function and outer-approximation method for MINLP optimization. Comput. Chem. Eng. (1990)
  • J. Zhang et al. COSMO-descriptor based computer-aided ionic liquid design for separation processes: Part II: task-specific design for extraction processes. Chem. Eng. Sci. (2017)
  • L. Zhang et al. Generic mathematical programming formulation and solution for computer-aided molecular design. Comput. Chem. Eng. (2015)
  • T. Zhou et al. A hybrid stochastic-deterministic optimization approach for integrated solvent and process design. Chem. Eng. Sci. (2017)
  • E. Aarts et al. Statistical cooling: a general approach to combinatorial optimization problems. Philips J. Res. (1985)
  • N.D. Austin et al. COSMO-based computer-aided molecular/mixture design: a focus on reaction solvents. AIChE J. (2018)
  • A. Bardow et al. Continuous-molecular targeting for integrated solvent and process design. Ind. Eng. Chem. Res. (2010)
  • R. Bokrantz et al. A dual algorithm for approximating Pareto sets in convex multi-criteria optimization. Technology (2011)
  • D.H. Bowskill et al. Beyond a heuristic analysis: integration of process and working-fluid design for organic Rankine cycles. Mol. Syst. Des. Eng. (2020)
  • J. Burger et al. A hierarchical method to integrated solvent and process design of physical CO2 absorption using the SAFT-γ Mie approach. AIChE J. (2015)
  • A. Buxton et al. Optimal design of solvent blends for environmental impact minimization. AIChE J. (1999)