Computational design and analysis of modular cells for large libraries of exchangeable product synthesis modules
Introduction
Modular design has gained recent interest as an effective approach to understand and redesign cellular systems (Garcia and Trinh, 2019a). In the fields of metabolic engineering and synthetic biology, various modularization strategies (Biggs et al., 2014; Trinh et al., 2015; Garcia and Trinh, 2019b, 2019c, 2020; Garcia et al., 2020) have been proposed to address the slow and expensive design-build-test cycles of developing microbial catalysts for renewable chemical synthesis (Nielsen and Keasling, 2016). A promising system-level modularization approach (Purnick and Weiss, 2009) is ModCell (Garcia and Trinh, 2019b), which aims to design a modular (chassis) cell compatible with exchangeable production modules that enable metabolite overproduction. ModCell could serve as an effective tool to design modular cells capable of efficiently producing a vast number of molecules offered by nature with minimal strain optimization requirements (Trinh and Mendoza, 2016; Lee et al., 2019), but it remains unexplored for large product libraries.
Previous efforts in computational modular cell design have been limited to analyzing small libraries of around 20 products (Garcia and Trinh, 2019b, 2020). However, the design of modular cells for larger product libraries is of both practical and theoretical interest. Theoretically, analyzing large libraries can reveal more general modular cell design rules, which might help to explain the naturally existing modularity of metabolic networks (Garcia and Trinh, 2019a). Practically, such modular cells could be implemented with genetic engineering techniques that enable rapid pathway generation, such as combinatorial ester pathways (Layton and Trinh, 2014). These modular cells could serve as a versatile platform for pathway selection and optimization using adaptive laboratory evolution (Wilbanks et al., 2017).
Modular cell design was formulated as a multi-objective optimization problem (MOP), named ModCell2, where each target phenotype activated by a module is an independent objective (Garcia and Trinh, 2019b). ModCell2 was solved with multi-objective evolutionary algorithms (MOEAs) that used a master-slave parallelization scheme, where the objective functions are evaluated in parallel by slave processes, but every other step in the algorithm is performed serially (Fig. 1a) (Garcia and Trinh, 2019b, 2019c). This approach contains many serial steps, and hence limits the scalability of the algorithm with the number of processes according to Amdahl's law (Hill and Marty, 2008). In particular, using large population sizes, an effective strategy to deal with many objectives (Garcia and Trinh, 2019c; Ishibuchi et al., 2009), could dramatically slow down serial algorithm operations such as non-dominated sorting in NSGA-II (Deb et al., 2002), one of the best performing MOEAs to solve ModCell2 (Garcia and Trinh, 2019c). Furthermore, increasing the product library size for ModCell leads to very large multi-objective optimization problems, which are notoriously difficult to solve (Ishibuchi et al., 2008; Li et al., 2018). Therefore, the master-slave approach used in ModCell2 is not suitable for analyzing large problems that contain hundreds of exchangeable production modules. A new parallelization approach that utilizes high-performance computing (HPC) more effectively is needed to advance ModCell.
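The scalability limit imposed by serial steps can be illustrated with Amdahl's law, which bounds the speedup of a program when only a fraction of its runtime can be parallelized. The following is a minimal sketch; the function name and the 90% parallel fraction are illustrative assumptions, not values measured for ModCell2.

```python
def amdahl_speedup(parallel_fraction: float, n_processes: int) -> float:
    """Theoretical speedup under Amdahl's law.

    parallel_fraction: share of the serial runtime that can run in parallel
    (hypothetical value; e.g. the objective-function evaluations).
    n_processes: number of processes sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of process count;
    # only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / n_processes)


if __name__ == "__main__":
    # Assumed example: 90% of the runtime is parallelizable.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this assumed 90% parallel fraction, 10 processes give a speedup of roughly 5.3, and no number of processes can exceed a speedup of 10, because the remaining 10% of serial work dominates.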
In recent years, multiple approaches to harness HPC have been developed for single-objective evolutionary algorithms (EAs) (Alba et al., 2013). In particular, the island-parallelization approach has been proposed, where multiple instances of the EA are run independently but communicate with each other to enhance overall convergence towards optimal solutions (Fig. 1b). This approach helps address the serial bottlenecks of the master-slave approach by separating the algorithm into highly independent processes that map directly to the computing hardware. While this approach has not been thoroughly examined in MOEAs, there are a few successful applications to specific design problems (Martens and Izzo, 2013; Jozefowiez et al., 2005; García-Sánchez et al., 2016).
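The island idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the ModCell-HPC implementation: it uses a single-objective stand-in problem (counting ones in a bit string), simulates the islands serially in one process, and assumes a ring migration topology with a fixed migration interval. All function names and parameter values are hypothetical.

```python
import random


def fitness(bits):
    # Toy objective (count of ones); stands in for a real design objective.
    return sum(bits)


def evolve(pop, rng, mutation_rate=0.05):
    # One generation on one island: binary tournament selection,
    # bit-flip mutation, then elitist survival of the best individuals.
    children = []
    for _ in range(len(pop)):
        a, b = rng.sample(pop, 2)
        parent = a if fitness(a) >= fitness(b) else b
        children.append([bit ^ (rng.random() < mutation_rate) for bit in parent])
    merged = sorted(pop + children, key=fitness, reverse=True)
    return merged[:len(pop)]


def migrate(islands, n_migrants=2):
    # Assumed ring topology: each island receives copies of the best
    # individuals of its predecessor, replacing its own worst individuals.
    best = [sorted(pop, key=fitness, reverse=True)[:n_migrants] for pop in islands]
    for i, pop in enumerate(islands):
        incoming = best[(i - 1) % len(islands)]
        pop.sort(key=fitness, reverse=True)
        pop[-n_migrants:] = [ind[:] for ind in incoming]


def island_ea(n_islands=4, pop_size=20, n_bits=40, generations=60,
              migration_interval=10, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    history = []  # best fitness over all islands, per generation
    for gen in range(1, generations + 1):
        # In an HPC setting each island would run in its own process;
        # here the loop over islands is serial for simplicity.
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migration_interval == 0:
            migrate(islands)
        history.append(max(fitness(ind) for pop in islands for ind in pop))
    return islands, history


if __name__ == "__main__":
    _, history = island_ea()
    print("best fitness at start and end:", history[0], history[-1])
```

The point of the design is that the only coupling between islands is the occasional exchange of a few individuals, so sorting and selection operate on small local populations rather than on one large global population.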
In this study, we developed ModCell-HPC, a highly parallel MOEA that uses the island parallelization approach to solve modular cell design problems with hundreds of objectives. We applied ModCell-HPC to design Escherichia coli modular cells with a large production module library of metabolically and biochemically diverse endogenous compounds and developed analysis tools to elucidate the principles of modular design. We envision that ModCell-HPC provides a useful tool to study the modularity of biological systems and to guide more efficient and generalizable design of modular cells that help reduce research and development costs in biocatalysis.
Multi-objective optimization formulation of modular cell design problem
The modular (chassis) cell can be built in a top-down manner by removing metabolic functions from a parent strain, and then inserting exchangeable modules into the chassis to create production strains that optimally display the target phenotypes (Trinh et al., 2015; Garcia and Trinh, 2019b, 2020). Due to the conflicting metabolic requirements of different product synthesis pathways, the modular cell design problem is formulated as the following MOP known as ModCell2 (Garcia and Trinh, 2019b):
Tuning of ModCell-HPC parameters
A known challenge of heuristic optimization approaches is their reliance on parameter tuning for rapid convergence towards optimal solutions. To identify sensible default parameters for ModCell-HPC, we first scanned parameter combinations with a previous 20-objective problem (Garcia and Trinh, 2019b) that is fast to solve, then focused on the most relevant parameters for a large-scale problem with 161 objectives corresponding to the current product library. In both cases, we used two
Conclusions
In this study, we developed ModCell-HPC, a computational method to design modular cells compatible with hundreds of product synthesis modules. We applied ModCell-HPC to design E. coli modular cells with a product library of 161 endogenous metabolites. This resulted in many Pareto optimal designs for the production of these molecules, from which we identified three modular cells that include all compatible products. The designs feature strategies consistent with previous experimental studies
Acknowledgement
This research was supported by the NSF CAREER award (NSF1553250), the DOE BER Genomic Science Program (DE-SC0019412), and by The Center of Bioenergy Innovation (CBI), U.S. Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References (47)
- et al.
Multivariate modular metabolic engineering for pathway and strain optimization
Curr. Opin. Biotechnol.
(2014)
- et al.
Improving NADPH availability for natural product biosynthesis in Escherichia coli by metabolic engineering
Metabolic Engineering
(2010)
- et al.
Latent pathway activation and increased pathway capacity enable Escherichia coli adaptation to loss of key metabolic enzymes
J. Biol. Chem.
(2006)
- et al.
Modular design: implementing proven engineering principles in biotechnology
Biotechnol. Adv.
(2019)
- et al.
Engineering modular ester fermentative pathways in Escherichia coli
Metab. Eng.
(2014)
A comprehensive metabolic map for production of bio-based chemicals
Nature Catalysis
(2019)
- et al.
Rational design of a synthetic Entner–Doudoroff pathway for improved and controllable NADPH regeneration
Metab. Eng.
(2015)
- et al.
Engineering cellular metabolism
Cell
(2016)
- et al.
Metabolic flux analysis for a ppc mutant Escherichia coli based on 13C-labelling experiments together with enzyme activity assays and intracellular metabolite measurements
FEMS (Fed. Eur. Microbiol. Soc.) Microbiol. Lett.
(2004)
- et al.
Modular cell design for rapid, efficient strain engineering toward industrialization of biology
Current Opinion in Chemical Engineering
(2016)
Design, construction and performance of the most efficient biomass producing E. coli bacterium
Metab. Eng.
Rational design of efficient modular cells
Metab. Eng.
The LASER database: formalizing design rules for metabolic engineering
Metabolic Engineering Communications
Parallel metaheuristics: recent advances and new trends
Int. Trans. Oper. Res.
Non-fermentative pathways for synthesis of branched chain higher alcohols as biofuels
Nature
Engineering the isobutanol biosynthetic pathway in Escherichia coli by comparison of three aldehyde reductase/alcohol dehydrogenase genes
Appl. Microbiol. Biotechnol.
Broadening the scope of enforced ATP wasting as a tool for metabolic engineering in Escherichia coli
Biotechnol. J.
Reserve flux capacity in the pentose phosphate pathway enables Escherichia coli's rapid response to oxidative stress
Cell systems
Metabolic characterisation of E. coli citrate synthase and phosphoenolpyruvate carboxylase mutants in aerobic cultures
Biotechnol. Lett.
A fast and elitist multiobjective genetic algorithm: NSGA-II
IEEE Trans. Evol. Comput.
Adaptive laboratory evolution–principles and applications for biotechnology
Microb. Cell Factories
Multiobjective strain design: a framework for modular cell engineering
Metab. Eng.
Comparison of multi-objective evolutionary algorithms to solve the modular cell design problem for novel biocatalysis
Processes