Elsevier

Metabolic Engineering

Volume 67, September 2021, Pages 453-463
Computational design and analysis of modular cells for large libraries of exchangeable product synthesis modules

https://doi.org/10.1016/j.ymben.2021.07.009

Highlights

  • Develop a novel computational method to design modular cells compatible with hundreds of product synthesis modules.

  • Design three Escherichia coli modular cells that can couple growth with product synthesis of a total of 85 molecules.

  • Identify removal of major byproducts and modification of branch points in central metabolism as key interventions for modular cell designs.

  • Design E. coli modular cells utilizing hexoses and pentoses and identify the limitations of pentoses for growth-coupled product synthesis.

  • Determine compatibility of an existing modular cell with new production modules.

Abstract

Microbial metabolism can be harnessed to produce a large library of useful chemicals from renewable resources such as plant biomass. However, it is laborious and expensive to create microbial biocatalysts to produce each new product. To tackle this challenge, we have recently developed modular cell (ModCell) design principles that enable rapid generation of production strains by assembling a modular (chassis) cell with exchangeable production modules to achieve overproduction of target molecules. Previous computational ModCell design methods are limited to analyzing small libraries of around 20 products. In this study, we developed a new computational method, named ModCell-HPC, that uses a highly parallel, multi-objective evolutionary algorithm to design modular cells for large libraries with hundreds of products and enables us to elucidate modular design properties. We applied ModCell-HPC to design Escherichia coli modular cells for a library of 161 endogenous production modules. From these simulations, we identified E. coli modular cells with few genetic manipulations that can produce dozens of molecules in a growth-coupled manner with different types of fermentable sugars. These designs revealed key genetic manipulations at the chassis and module levels to accomplish versatile modular cells, involving not only the removal of major by-products but also the modification of branch points in central metabolism. We further found that the effect of the degradation of different sugars on redox metabolism results in lower compatibility between a modular cell and production modules for growth on pentoses than on hexoses. To better characterize the degree of compatibility, we developed a method to calculate the minimal set cover, identifying that only three modular cells are needed to couple with all compatible production modules.
By determining the unknown compatibility contribution metric, we further elucidated the design features that allow an existing modular cell to be re-purposed towards production of new molecules. Overall, ModCell-HPC is a useful tool for understanding modularity of biological systems and guiding more efficient and generalizable design of modular cells that help reduce research and development cost in biocatalysis.
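As an illustration of the minimal set cover idea mentioned in the abstract, the sketch below finds the smallest group of modular cell designs whose compatible products together include every product that is compatible with at least one design. It is a brute-force search over small inputs and is not the method used in the study; the design names, product names, and the function `minimal_set_cover` are hypothetical.

```python
from itertools import combinations


def minimal_set_cover(compat: dict) -> list:
    """Return a smallest list of design names whose product sets cover the universe.

    `compat` maps a design name to the set of products it is compatible with.
    Exhaustive search: only practical for a handful of designs.
    """
    # Universe = every product compatible with at least one design.
    universe = set().union(*compat.values())
    names = sorted(compat)
    # Try covers of increasing size so the first hit is minimal.
    for k in range(len(names) + 1):
        for combo in combinations(names, k):
            covered = set().union(*(compat[name] for name in combo))
            if covered == universe:
                return list(combo)
    return names  # not reached: the full set of designs always covers the universe


# Hypothetical compatibility data (assumption, not taken from the study).
compat = {
    "A": {"p1", "p2", "p3"},
    "B": {"p3", "p4"},
    "C": {"p4", "p5"},
    "D": {"p1", "p5"},
}
cover = minimal_set_cover(compat)
print(cover)
```

For libraries with hundreds of products and many candidate designs, an exhaustive search like this becomes impractical, and an integer programming or greedy formulation would be the usual alternative.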

Introduction

Modular design has gained recent interest as an effective approach to understand and redesign cellular systems (Garcia and Trinh, 2019a). In the fields of metabolic engineering and synthetic biology, various modularization strategies (Biggs et al., 2014; Trinh et al., 2015; Garcia and Trinh, 2019b, 2019c, 2020; Garcia et al., 2020) have been proposed to address the slow and expensive design-build-test cycles of developing microbial catalysts for renewable chemical synthesis (Nielsen and Keasling, 2016). A promising system-level modularization approach (Purnick and Weiss, 2009) is ModCell (Garcia and Trinh, 2019b), which aims to design a modular (chassis) cell compatible with exchangeable production modules that enable metabolite overproduction. ModCell could be used as an effective tool to design modular cells capable of efficiently producing a vast number of molecules offered by nature with minimal strain optimization requirements (Trinh and Mendoza, 2016; Lee et al., 2019), but it remains unexplored for large product libraries.

Previous efforts in computational modular cell design are limited to analyzing small libraries of around 20 products (Garcia and Trinh, 2019b, 2020). However, the design of modular cells for larger product libraries is of both practical and theoretical interest. Theoretically, analyzing large libraries can reveal more general modular cell design rules, which might help to explain the naturally existing modularity of metabolic networks (Garcia and Trinh, 2019a). Practically, such modular cells could be implemented with genetic engineering techniques that enable rapid pathway generation, such as combinatorial ester pathways (Layton and Trinh, 2014). These modular cells could serve as a versatile platform for pathway selection and optimization using adaptive laboratory evolution (Wilbanks et al., 2017).

Modular cell design was formulated as a multi-objective optimization problem (MOP), named ModCell2, where each target phenotype activated by a module is an independent objective (Garcia and Trinh, 2019b). ModCell2 was solved with multi-objective evolutionary algorithms (MOEAs) that used a master-slave parallelization scheme, where the objective functions are evaluated in parallel by slave processes, but every other step in the algorithm is performed serially (Fig. 1a) (Garcia and Trinh, 2019b, 2019c). This approach contains many serial steps, and hence limits the scalability of the algorithm with the number of processes according to Amdahl's law (Hill and Marty, 2008). In particular, using large population sizes, an effective strategy to deal with many objectives (Garcia and Trinh, 2019c; Ishibuchi et al., 2009), could dramatically slow down serial algorithm operations such as non-dominated sorting in NSGA-II (Deb et al., 2002), one of the best performing MOEAs to solve ModCell2 (Garcia and Trinh, 2019c). Furthermore, increasing the product library size for ModCell leads to very large multi-objective optimization problems, which are notoriously difficult to solve (Ishibuchi et al., 2008; Li et al., 2018). Therefore, the master-slave approach used in ModCell2 is not suitable for analyzing large problems that contain hundreds of exchangeable production modules. A new parallelization approach that utilizes high-performance computing (HPC) more effectively is needed to advance ModCell.
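The scalability limit imposed by Amdahl's law can be made concrete with a short calculation. The sketch below computes the ideal speedup when only a fraction of the runtime (here, objective function evaluation) is parallelized. The 90% parallel fraction is an assumed value chosen for illustration, not a measurement from ModCell2, and the function name `amdahl_speedup` is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, n_processes: int) -> float:
    """Ideal speedup under Amdahl's law.

    parallel_fraction: share of single-process runtime that can run in parallel.
    n_processes: number of processes working on the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processes < 1:
        raise ValueError("n_processes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processes)


# Assumed split: 90% of the runtime is objective evaluation (parallel),
# 10% is serial bookkeeping such as sorting and selection.
for n in (1, 8, 64, 512):
    print(f"{n:4d} processes -> speedup {amdahl_speedup(0.9, n):.2f}")
```

With a 10% serial share the speedup can never exceed 10, no matter how many processes are added, which is why reducing the serial steps matters more than adding hardware.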

In recent years, multiple approaches to harness HPC have been developed for single-objective evolutionary algorithms (EAs) (Alba et al., 2013). In particular, the island-parallelization approach has been proposed, where multiple instances of the EA run independently but communicate with each other to enhance overall convergence towards optimal solutions (Fig. 1b). This approach helps address the serial bottlenecks of the master-slave approach by separating the algorithm into highly independent processes that map directly to the computing hardware. While this approach has not been thoroughly examined in MOEAs, there are a few successful applications to specific design problems (Martens and Izzo, 2013; Jozefowiez et al., 2005; García-Sánchez et al., 2016).
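To make the island scheme concrete, the following is a minimal serial sketch of an island-model MOEA, not the ModCell-HPC implementation. The bit-string encoding, the ring migration topology, the toy two-objective function, and all function names and parameter defaults are assumptions made for illustration. In an HPC setting each island would run as its own process and exchange migrants by message passing.

```python
import random
from typing import Callable, List, Sequence, Tuple

Design = Tuple[int, ...]   # hypothetical encoding: 1 = reaction deleted in the chassis
Scores = Tuple[float, ...]  # one objective value per production module


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b in every objective and better in one (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(pop: Sequence[Design], evaluate: Callable[[Design], Scores]) -> List[Design]:
    """Return the unique designs in pop that no other design dominates."""
    scored = [(d, evaluate(d)) for d in dict.fromkeys(pop)]  # drop duplicates, keep order
    return [d for d, s in scored if not any(dominates(t, s) for _, t in scored)]


def evolve(pop: List[Design], evaluate, rng: random.Random, rate: float = 0.1) -> List[Design]:
    """One generation: mutate random parents, keep the non-dominated set, refill to size."""
    size = len(pop)
    children = [tuple(b ^ 1 if rng.random() < rate else b for b in rng.choice(pop))
                for _ in range(size)]
    pool = list(pop) + children
    front = non_dominated(pool, evaluate)
    if len(front) >= size:
        return rng.sample(front, size)
    return front + [rng.choice(pool) for _ in range(size - len(front))]


def island_model(evaluate, n_islands=4, size=20, n_bits=8,
                 epochs=10, gens_per_epoch=5, n_migrants=2, seed=0) -> List[Design]:
    rng = random.Random(seed)
    islands = [[tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(size)]
               for _ in range(n_islands)]
    for _ in range(epochs):
        # Independent phase: no island reads another island's population here.
        for i in range(n_islands):
            for _ in range(gens_per_epoch):
                islands[i] = evolve(islands[i], evaluate, rng)
        # Migration phase on a ring: island i sends copies to island i + 1.
        migrants = [non_dominated(pop, evaluate)[:n_migrants] for pop in islands]
        for i, group in enumerate(migrants):
            if group:
                target = islands[(i + 1) % n_islands]
                target[-len(group):] = group  # overwrite the last slots of the receiver
    merged = [d for pop in islands for d in pop]
    return non_dominated(merged, evaluate)


def toy_scores(d: Design) -> Scores:
    """Toy conflicting objectives: total ones versus zeros in the first half."""
    half = len(d) // 2
    return (float(sum(d)), float(half - sum(d[:half])))


front = island_model(toy_scores)
print(len(front), "non-dominated designs found")
```

Because islands only interact during migration, the independent phase has no shared serial step, which is the property that lets the scheme map onto many processes.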

In this study, we developed ModCell-HPC, a highly parallel MOEA that uses the island parallelization approach to solve modular cell design problems with hundreds of objectives. We applied ModCell-HPC to design Escherichia coli modular cells with a large production module library of metabolically and biochemically diverse endogenous compounds and developed analysis tools to elucidate the principles of modular design. We envision that ModCell-HPC provides a useful tool to study modularity of biological systems and guide more efficient and generalizable design of modular cells that help reduce research and development cost in biocatalysis.

Section snippets

Multi-objective optimization formulation of modular cell design problem

The modular (chassis) cell can be built in a top-down manner by removing metabolic functions from a parent strain, and then inserting exchangeable modules into the chassis to create production strains that optimally display the target phenotypes (Trinh et al., 2015; Garcia and Trinh, 2019b, 2020). Due to the conflicting metabolic requirements of different product synthesis pathways, the modular cell design problem is formulated as the following MOP known as ModCell2 (Garcia and Trinh, 2019b):max

Tuning of ModCell-HPC parameters

A known challenge of heuristic optimization approaches is their reliance on parameter tuning for rapid convergence towards optimal solutions. To identify sensible default parameters for ModCell-HPC, we first scanned parameter combinations with a previous 20-objective problem (Garcia and Trinh, 2019b) that is fast to solve, then focused on the most relevant parameters for a large-scale problem with 161 objectives corresponding to the current product library. In both cases, we used two

Conclusions

In this study, we developed ModCell-HPC, a computational method to design modular cells compatible with hundreds of product synthesis modules. We applied ModCell-HPC to design E. coli modular cells with a product library of 161 endogenous metabolites. This resulted in many Pareto optimal designs for the production of these molecules, from which we identified three modular cells that include all compatible products. The designs feature strategies consistent with previous experimental studies

Acknowledgement

This research was supported by the NSF CAREER award (NSF1553250), the DOE BER Genomic Science Program (DE-SC0019412), and by The Center of Bioenergy Innovation (CBI), U.S. Department of Energy Bioenergy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References (47)

  • C.T. Trinh et al.

    Design, construction and performance of the most efficient biomass producing E. coli bacterium

    Metab. Eng.

    (2006)
  • C.T. Trinh et al.

    Rational design of efficient modular cells

    Metab. Eng.

    (2015)
  • J.D. Winkler et al.

    The LASER database: formalizing design rules for metabolic engineering

    Metabolic Engineering Communications

    (2015)
  • E. Alba et al.

    Parallel metaheuristics: recent advances and new trends

    Int. Trans. Oper. Res.

    (2013)
  • S. Atsumi et al.

    Non-fermentative pathways for synthesis of branched chain higher alcohols as biofuels

    Nature

    (2008)
  • S. Atsumi et al.

    Engineering the isobutanol biosynthetic pathway in Escherichia coli by comparison of three aldehyde reductase/alcohol dehydrogenase genes

    Appl. Microbiol. Biotechnol.

    (2010)
  • S. Boecker et al.

    Broadening the scope of enforced ATP wasting as a tool for metabolic engineering in Escherichia coli

    Biotechnol. J.

    (2019)
  • D. Christodoulou et al.

    Reserve flux capacity in the pentose phosphate pathway enables Escherichia coli's rapid response to oxidative stress

    Cell Systems

    (2018)
  • S. De Maeseneire et al.

    Metabolic characterisation of E. coli citrate synthase and phosphoenolpyruvate carboxylase mutants in aerobic cultures

    Biotechnol. Lett.

    (2006)
  • K. Deb et al.

    A fast and elitist multiobjective genetic algorithm: NSGA-II

    IEEE Trans. Evol. Comput.

    (2002)
  • M. Dragosits et al.

    Adaptive laboratory evolution–principles and applications for biotechnology

    Microb. Cell Factories

    (2013)
  • S. Garcia et al.

    Multiobjective strain design: a framework for modular cell engineering

    Metab. Eng.

    (2019)
  • S. Garcia et al.

    Comparison of multi-objective evolutionary algorithms to solve the modular cell design problem for novel biocatalysis

    Processes

    (2019)