  • Review Article

Mining genomes to illuminate the specialized chemistry of life

Abstract

All organisms produce specialized organic molecules, ranging from small volatile chemicals to large gene-encoded peptides, that have evolved to provide them with diverse cellular and ecological functions. As natural products, they are broadly applied in medicine, agriculture and nutrition. The rapid accumulation of genomic information has revealed that the metabolic capacity of virtually all organisms is vastly underappreciated. Pioneered mainly in bacteria and fungi, genome mining technologies are accelerating metabolite discovery. Recent efforts are now being expanded to all life forms, including protists, plants and animals, and new integrative omics technologies are enabling the increasingly effective mining of this molecular diversity.

Fig. 1: Life’s chemical diversity.
Fig. 2: Linking genes to molecules using metabolomics and transcriptomics.
Fig. 3: Function-first genome mining approaches.


    Article  Google Scholar 

  136. Kersten, R. D. et al. Glycogenomics as a mass spectrometry-guided genome-mining method for microbial glycosylated molecules. Proc. Natl Acad. Sci. USA 110, E4407–E4416 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  137. Medema, M. H. et al. Pep2Path: automated mass spectrometry-guided genome mining of peptidic natural products. PLoS Comput. Biol. 10, e1003822 (2014).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  138. Mohimani, H. et al. Automated genome mining of ribosomal peptide natural products. ACS Chem. Biol. 9, 1545–1551 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  139. Cao, L. et al. MetaMiner: a scalable peptidogenomics approach for discovery of ribosomal peptide natural products with blind modifications from microbial communities. Cell Syst. 9, 600–608.e4 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  140. Vogt, E. & Künzler, M. Discovery of novel fungal RiPP biosynthetic pathways and their application for the development of peptide therapeutics. Appl. Microbiol. Biotechnol. 103, 5567–5581 (2019).

    Article  CAS  PubMed  Google Scholar 

  141. Dejong, C. A. et al. Polyketide and nonribosomal peptide retro-biosynthesis and global gene cluster matching. Nat. Chem. Biol. 12, 1007–1014 (2016). This article uses a retrobiosynthetic approach to break down metabolites into their constituent building blocks and match these strings of building blocks to gene clusters using substrate specificity predictions of the encoded enzyme sequences.

    Article  CAS  PubMed  Google Scholar 

  142. Luo, D. et al. Oxidation and cyclization of casbene in the biosynthesis of Euphorbia factors from mature seeds of Euphorbia lathyris L. Proc. Natl Acad. Sci. USA 113, E5082–E5089 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  143. Jeon, J. E. et al. A pathogen-responsive gene cluster for highly modified fatty acids in tomato. Cell 180, 176–187.e19 (2020). This paper arguably represents the most comprehensive single co-expression data set used thus far for genome mining of a novel plant biosynthetic pathway.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  144. Itkin, M. et al. Biosynthesis of antinutritional alkaloids in solanaceous crops is mediated by clustered genes. Science 341, 175–179 (2013).

    Article  CAS  PubMed  Google Scholar 

  145. Lau, W. & Sattely, E. S. Six enzymes from mayapple that complete the biosynthetic pathway to the etoposide aglycone. Science 349, 1224–1228 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  146. Rajniak, J., Barco, B., Clay, N. K. & Sattely, E. S. A new cyanogenic metabolite in Arabidopsis required for inducible pathogen defence. Nature 525, 376–379 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  147. Saelens, W., Cannoodt, R. & Saeys, Y. A comprehensive evaluation of module detection methods for gene expression data. Nat. Commun. 9, 1090 (2018).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  148. Wisecaver, J. H. et al. A global coexpression network approach for connecting genes to specialized metabolic pathways in plants. Plant. Cell 29, 944–959 (2017). This study introduces a powerful co-expression-based pathway discovery method, using mutual ranks and clustering to identify co-expression modules.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  149. Becher, P. G. et al. Developmentally regulated volatiles geosmin and 2-methylisoborneol attract a soil arthropod to Streptomyces bacteria promoting spore dispersal. Nat. Microbiol. 5, 821–829 (2020).

    Article  CAS  PubMed  Google Scholar 

  150. Muhlemann, J. K., Younts, T. L. B. & Muday, G. K. Flavonols control pollen tube growth and integrity by regulating ROS homeostasis during high-temperature stress. Proc. Natl Acad. Sci. USA 115, E11188–E11197 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  151. Bruns, H. et al. Function-related replacement of bacterial siderophore pathways. ISME J. 12, 320–329 (2018).

    Article  CAS  PubMed  Google Scholar 

  152. Rajniak, J. et al. Biosynthesis of redox-active metabolites in response to iron deficiency in plants. Nat. Chem. Biol. 14, 442–450 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  153. Crits-Christoph, A., Bhattacharya, N., Olm, M. R., Song, Y. S. & Banfield, J. F. Transporter genes in biosynthetic gene clusters predict metabolite characteristics and siderophore activity. Genome Res. 31, 239–250 (2021).

    Article  PubMed Central  Google Scholar 

  154. Yeh, H.-H. et al. Resistance gene-guided genome mining: serial promoter exchanges in Aspergillus nidulans reveal the biosynthetic pathway for fellutamide B, a proteasome inhibitor. ACS Chem. Biol. 11, 2275–2284 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  155. Panter, F., Krug, D., Baumann, S. & Müller, R. Self-resistance guided genome mining uncovers new topoisomerase inhibitors from myxobacteria. Chem. Sci. 9, 4898–4908 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  156. Yan, Y. et al. Resistance-gene-directed discovery of a natural-product herbicide with a new mode of action. Nature 559, 415–418 (2018). This article is a prime example of target-based genome mining, leading to the discovery of a novel herbicide from fungi.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  157. Mungan, M. D. et al. ARTS 2.0: feature updates and expansion of the Antibiotic Resistant Target Seeker for comparative genome mining. Nucleic Acids Res. 48, W546–W552 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  158. Nonejuie, P. et al. Application of bacterial cytological profiling to crude natural product extracts reveals the antibacterial arsenal of Bacillus subtilis. J. Antibiot. 69, 353–361 (2016).

    Article  CAS  Google Scholar 

  159. Kurita, K. L., Glassey, E. & Linington, R. G. Integration of high-content screening and untargeted metabolomics for comprehensive functional annotation of natural product libraries. Proc. Natl Acad. Sci. USA 112, 11999–12004 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  160. Aliper, A. et al. Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm. 13, 2524–2530 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  161. Shang, Y. et al. Biosynthesis, regulation, and domestication of bitterness in cucumber. Science 346, 1084–1088 (2014). This article uses genome-wide association studies to identify the BGC for cucurbitacin in cucumber, which is responsible for a characteristic bitter taste.

    Article  CAS  PubMed  Google Scholar 

  162. Crits-Christoph, A., Diamond, S., Butterfield, C. N., Thomas, B. C. & Banfield, J. F. Novel soil bacteria possess diverse genes for secondary metabolite biosynthesis. Nature 558, 440–444 (2018). This paper identifies a new clade of uncultivated microbes as potent natural product producers, and introduces metatranscriptomics-based co-expression analysis to predict likely functions for some of their BGCs.

    Article  CAS  PubMed  Google Scholar 

  163. Oyserman, B. O., Medema, M. H. & Raaijmakers, J. M. Road MAPs to engineer host microbiomes. Curr. Opin. Microbiol. 43, 46–54 (2018).

    Article  CAS  PubMed  Google Scholar 

  164. Huang, A. C. et al. A specialized metabolic network selectively modulates Arabidopsis root microbiota. Science 364, eaau6389 (2019).

    Article  CAS  PubMed  Google Scholar 

  165. Chevrette, M. G., Aicheler, F., Kohlbacher, O., Currie, C. R. & Medema, M. H. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria. Bioinformatics 33, 3202–3210 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  166. Helfrich, E. J. N. et al. Automated structure prediction of trans-acyltransferase polyketide synthase products. Nat. Chem. Biol. 15, 813–821 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  167. Agrawal, P. & Mohanty, D. A machine-learning-based method for prediction of macrocyclization patterns of polyketides and nonribosomal peptides. Bioinformatics 37, 603–611 (2020).

    Article  Google Scholar 

  168. Dührkop, K., Shen, H., Meusel, M., Rousu, J. & Böcker, S. Searching molecular structure databases with tandem mass spectra using CSI:FingerID. Proc. Natl Acad. Sci. USA 112, 12580–12585 (2015).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  169. van der Hooft, J. J. J., Wandy, J., Barrett, M. P., Burgess, K. E. V. & Rogers, S. Topic modeling for untargeted substructure exploration in metabolomics. Proc. Natl Acad. Sci. USA 113, 13738–13743 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  170. Rodrigues, T., Reker, D., Schneider, P. & Schneider, G. Counting on natural products for drug design. Nat. Chem. 8, 531–541 (2016).

    Article  CAS  PubMed  Google Scholar 

  171. Reker, D. et al. Revealing the macromolecular targets of complex natural products. Nat. Chem. 6, 1072–1078 (2014). This paper introduces computational methods to predict macromolecular targets for natural products, by comparing fragments of a query metabolite with those found in a training set of metabolites with known targets.

    Article  CAS  PubMed  Google Scholar 

  172. Stokes, J. M. et al. A deep learning approach to antibiotic discovery. Cell 181, 475–483 (2020).

    Article  CAS  PubMed  Google Scholar 

  173. Skinnider, M. A. et al. Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences. Nat. Commun. 11, 6058 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  174. Karim, A. S. et al. In vitro prototyping and rapid optimization of biosynthetic enzymes for cell design. Nat. Chem. Biol. 16, 912–919 (2020).

    Article  CAS  PubMed  Google Scholar 

  175. Zhang, J. J., Tang, X. & Moore, B. S. Genetic platforms for heterologous expression of microbial natural products. Nat. Prod. Rep. 36, 1313–1332 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  176. Huo, L. et al. Heterologous expression of bacterial natural product biosynthetic pathways. Nat. Prod. Rep. 36, 1412–1436 (2019).

    Article  CAS  PubMed  Google Scholar 

  177. Lin, Z., Nielsen, J. & Liu, Z. Bioprospecting through cloning of whole natural product biosynthetic gene clusters. Front. Bioeng. Biotechnol. 8, 526 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  178. Lee, N. C. O., Larionov, V. & Kouprina, N. Highly efficient CRISPR/Cas9-mediated TAR cloning of genes and chromosomal loci from complex genomes in yeast. Nucleic Acids Res. 43, e55 (2015).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  179. Yamanaka, K. et al. Direct cloning and refactoring of a silent lipopeptide biosynthetic gene cluster yields the antibiotic taromycin A. Proc. Natl Acad. Sci. USA 111, 1957–1962 (2014).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  180. Fu, J. et al. Full-length RecE enhances linear–linear homologous recombination and facilitates direct cloning for bioprospecting. Nat. Biotechnol. 30, 440–446 (2012).

    Article  CAS  PubMed  Google Scholar 

  181. Enghiad, B. & Zhao, H. Programmable DNA-guided artificial restriction enzymes. ACS Synth. Biol. 6, 752–757 (2017).

    Article  CAS  PubMed  Google Scholar 

  182. Enghiad, B. et al. Cas12a-assisted precise targeted cloning using in vivo Cre-lox recombination. Nat. Commun. 12, 1171 (2021).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  183. Shapland, E. B. et al. Low-cost, high-throughput sequencing of DNA assemblies using a highly multiplexed Nextera process. ACS Synth. Biol. 4, 860–866 (2015).

    Article  CAS  PubMed  Google Scholar 

  184. Zhang, J. J., Tang, X., Zhang, M., Nguyen, D. & Moore, B. S. Broad-host-range expression reveals native and host regulatory elements that influence heterologous antibiotic production in Gram-negative bacteria. mBio 8, e01291-17 (2017).

    PubMed  PubMed Central  Google Scholar 

  185. Wang, G. et al. CRAGE enables rapid activation of biosynthetic gene clusters in undomesticated bacteria. Nat. Microbiol. 4, 2498–2510 (2019).

    Article  PubMed  CAS  Google Scholar 

  186. Harvey, C. J. B. et al. HEx: a heterologous expression platform for the discovery of fungal natural products. Sci. Adv. 4, eaar5459 (2018). This paper introduces a streamlined and largely automated workflow for genome mining and gene synthesis-based expression of fungal BGCs; the authors tested 41 different fungal BGCs and detected metabolites for 22 of them.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  187. Casini, A. et al. A pressure test to make 10 molecules in 90 days: external evaluation of methods to engineer biology. J. Am. Chem. Soc. 140, 4302–4316 (2018).

    Article  CAS  PubMed  Google Scholar 

  188. Smanski, M. J. et al. Functional optimization of gene clusters by combinatorial design and assembly. Nat. Biotechnol. 32, 1241–1249 (2014).

    Article  CAS  PubMed  Google Scholar 

  189. Meyer, A. J., Segall-Shapiro, T. H., Glassey, E., Zhang, J. & Voigt, C. A. Escherichia coli ‘Marionette’ strains with 12 highly optimized small-molecule sensors. Nat. Chem. Biol. 15, 196–204 (2019). Together with Smanski et al. (2014), this paper describes innovative methods to fine-tune the expression stoichiometry of synthetically refactored gene clusters (using either combinatorialization or sensor-based control of gene expression), in order to attain functional expression and production of the actual end compound of a pathway of interest.

    Article  CAS  PubMed  Google Scholar 

  190. Proctor, R. H., Hohn, T. M. & McCormick, S. P. Restoration of wild-type virulence to Tri5 disruption mutants of Gibberella zeae via gene reversion and mutant complementation. Microbiology 143, 2583–2591 (1997).

    Article  CAS  PubMed  Google Scholar 

  191. Rubin, B. E. et al. Targeted genome editing of bacteria within microbial communities. Preprint at bioRxiv https://doi.org/10.1101/2020.07.17.209189 (2020).

    Article  Google Scholar 

  192. Lam, K. N. et al. Phage-delivered CRISPR–Cas9 for strain-specific depletion and genomic deletions in the gut microbiota. Preprint at bioRxiv https://doi.org/10.1101/2020.07.09.193847 (2020).

    Article  Google Scholar 

  193. Gurevich, A. et al. Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra. Nat. Microbiol. 3, 319–327 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  194. Reher, R. et al. A convolutional neural network-based approach for the rapid annotation of molecularly diverse natural products. J. Am. Chem. Soc. 142, 4114–4120 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  195. Burns, D. C., Mazzola, E. P. & Reynolds, W. F. The role of computer-assisted structure elucidation (CASE) programs in the structure elucidation of complex natural products. Nat. Prod. Rep. 36, 919–933 (2019).

    Article  CAS  PubMed  Google Scholar 

  196. Inokuma, Y. et al. X-ray analysis on the nanogram to microgram scale using porous complexes. Nature 495, 461–466 (2013).

    Article  CAS  PubMed  Google Scholar 

  197. Danelius, E., Halaby, S., van der Donk, W. A. & Gonen, T. MicroED in natural product and small molecule research. Nat. Prod. Rep. 38, 423–431 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  198. Chu, J. et al. Discovery of MRSA active antibiotics using primary sequence from the human microbiome. Nat. Chem. Biol. 12, 1004–1006 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  199. Chu, J., Vila-Farres, X. & Brady, S. F. Bioactive synthetic-bioinformatic natural product cyclic peptides inspired by nonribosomal peptide synthetase gene clusters from the human microbiome. J. Am. Chem. Soc. 141, 15737–15741 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  200. Chu, J. et al. Synthetic-nioinformatic natural product antibiotics with diverse modes of action. J. Am. Chem. Soc. 142, 14158–14168 (2020).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  201. Hudson, G. A., Hooper, A. R., DiCaprio, A. J., Sarlah, D. & Mitchell, D. A. Structure prediction and synthesis of pyridine-based macrocyclic peptide natural products. Org. Lett. 23, 253–256 (2021).

    Article  CAS  PubMed  Google Scholar 

  202. Challis, G. L. & Ravel, J. Coelichelin, a new peptide siderophore encoded by the Streptomyces coelicolor genome: structure prediction from the sequence of its non-ribosomal peptide synthetase. FEMS Microbiol. Lett. 187, 111–114 (2000).

    Article  CAS  PubMed  Google Scholar 

  203. Blin, K., Kim, H. U., Medema, M. H. & Weber, T. Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters. Brief. Bioinform. 20, 1103–1113 (2019).

    Article  CAS  PubMed  Google Scholar 

  204. Medema, M. H. & Fischbach, M. A. Computational approaches to natural product discovery. Nat. Chem. Biol. 11, 639–648 (2015).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  205. Kjærbølling, I., Vesth, T. & Andersen, M. R. Resistance gene-directed genome mining of 50 species. mSystems https://doi.org/10.1101/457903 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  206. Zallot, R., Oberg, N. & Gerlt, J. A. The EFI web resource for genomic enzymology tools: leveraging protein, genome, and metagenome databases to discover novel enzymes and metabolic pathways. Biochemistry 58, 4169–4182 (2019).

    Article  CAS  PubMed  Google Scholar 

  207. Usadel, B. et al. Co-expression tools for plant biology: opportunities for hypothesis generation and caveats. Plant. Cell Env. 32, 1633–1651 (2009).

    Article  CAS  Google Scholar 

  208. Serin, E. A. R., Nijveen, H., Hilhorst, H. W. M. & Ligterink, W. Learning from co-expression networks: possibilities and challenges. Front. Plant. Sci. 7, 444 (2016).

    Article  PubMed  PubMed Central  Google Scholar 

  209. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinforma. 9, 559 (2008).

    Article  CAS  Google Scholar 

  210. Tzfadia, O. et al. CoExpNetViz: comparative co-expression networks construction and visualization tool. Front. Plant. Sci. 6, 1194 (2015).

    PubMed  Google Scholar 

  211. Gubbens, J. et al. Natural product proteomining, a quantitative proteomics platform, allows rapid discovery of biosynthetic gene clusters for different classes of natural products. Chem. Biol. 21, 707–718 (2014).

    Article  CAS  PubMed  Google Scholar 

  212. Ding, Y. et al. Genetic elucidation of interconnected antibiotic pathways mediating maize innate immunity. Nat. Plants 6, 1375–1388 (2020).

    Article  CAS  PubMed  Google Scholar 

  213. Levin, B. J. et al. A prominent glycyl radical enzyme in human gut microbiomes metabolizes 4-hydroxy-l-proline. Science 355, eaai8386 (2017).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  214. Soldatou, S., Eldjarn, G. H., Huerta-Uribe, A., Rogers, S. & Duncan, K. R. Linking biosynthetic and chemical space to accelerate microbial secondary metabolite discovery. FEMS Microbiol. Lett. 366, fnz142 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  215. Erbilgin, O. et al. MAGI: a method for metabolite annotation and gene integration. ACS Chem. Biol. 14, 704–714 (2019).

    Article  CAS  PubMed  Google Scholar 

  216. Pascal Andreu, V. et al. BiG-MAP: an automated pipeline to profile metabolic gene cluster abundance and expression in microbiomes. Preprint at bioRxiv https://doi.org/10.1101/2020.12.14.422671 (2020).

    Article  Google Scholar 

  217. Kersten, R. D. et al. Bioactivity-guided genome mining reveals the lomaiviticin biosynthetic gene cluster in Salinispora tropica. Chembiochem 14, 955–962 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  218. Mohimani, H. & Pevzner, P. A. Dereplication, sequencing and identification of peptidic natural products: from genome mining to peptidogenomics to spectral networks. Nat. Prod. Rep. 33, 73–86 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  219. Ricart, E. et al. rBAN: retro-biosynthetic analysis of nonribosomal peptides. J. Cheminform. 11, 13 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  220. Blaženović, I., Kind, T., Ji, J. & Fiehn, O. Software tools and approaches for compound identification of LC–MS/MS data in metabolomics. Metabolites 8, 31 (2018).

    Article  PubMed Central  CAS  Google Scholar 

  221. Lo, H.-C. et al. Two separate gene clusters encode the biosynthetic pathway for the meroterpenoids austinol and dehydroaustinol in Aspergillus nidulans. J. Am. Chem. Soc. 134, 4709–4720 (2012).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  222. Sanchez, J. F. et al. Genome-based deletion analysis reveals the prenyl xanthone biosynthesis pathway in Aspergillus nidulans. J. Am. Chem. Soc. 133, 4010–4017 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  223. Andersen, M. R. et al. Accurate prediction of secondary metabolite gene clusters in filamentous fungi. Proc. Natl Acad. Sci. USA 110, E99–E107 (2013).

    Article  CAS  PubMed  Google Scholar 

  224. Huang, A. C. et al. Unearthing a sesterterpene biosynthetic repertoire in the Brassicaceae through genome mining reveals convergent evolution. Proc. Natl Acad. Sci. USA 114, E6005–E6014 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  225. Shoguchi, E. et al. A new dinoflagellate genome illuminates a conserved gene cluster involved in sunscreen biosynthesis. Genome Biol. Evol. 13, evaa235 (2021).

    Article  PubMed  CAS  Google Scholar 

  226. Zhao, T. & Schranz, M. E. Network-based microsynteny analysis identifies major differences and genomic outliers in mammalian and angiosperm genomes. Proc. Natl Acad. Sci. USA 116, 2165–2174 (2019).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  227. Bok, J. W. et al. Chromatin-level regulation of biosynthetic gene clusters. Nat. Chem. Biol. 5, 462–464 (2009).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  228. Yu, N. et al. Delineation of metabolic gene clusters in plant genomes by chromatin signatures. Nucleic Acids Res. 44, 2255–2265 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  229. Lawrence, J. G. & Roth, J. R. Selfish operons: horizontal transfer may drive the evolution of gene clusters. Genetics 143, 1843–1860 (1996).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  230. Ballouz, S., Francis, A. R., Lan, R. & Tanaka, M. M. Conditions for the evolution of gene clusters in bacterial genomes. PLoS Comput. Biol. 6, e1000672 (2010).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  231. McGary, K. L., Slot, J. C. & Rokas, A. Physical linkage of metabolic genes in fungi is an adaptation against the accumulation of toxic intermediate compounds. Proc. Natl Acad. Sci. USA 110, 11481–11486 (2013).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  232. Field, B. et al. Formation of plant metabolic gene clusters within dynamic chromosomal regions. Proc. Natl Acad. Sci. USA 108, 16116–16121 (2011).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  233. Gluck-Thaler, E. & Slot, J. C. Specialized plant biochemistry drives gene clustering in fungi. ISME J. 12, 1694–1705 (2018).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  234. Schorn, M. A. et al. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters. Microbiology 162, 2075–2086 (2016).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  235. van Santen, J. A. et al. The natural products atlas: an open access knowledge base for microbial natural products discovery. ACS Cent. Sci. 5, 1824–1833 (2019).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  236. Skinnider, M. A. & Magarvey, N. A. Statistical reanalysis of natural products reveals increasing chemical diversity. Proc. Natl Acad. Sci. USA 114, E6271–E6272 (2017).

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  237. Thrash, J. C. Culturing the uncultured: risk versus reward. mSystems 4, e00130–19 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  238. Atanasov, A. G., Zotchev, S. B., Dirsch, V. M., International Natural Product Sciences Taskforce & Supuran, C. T. Natural products in drug discovery: advances and opportunities. Nat. Rev. Drug Discov. 20, 200–216 (2021).

    Article  CAS  PubMed  Google Scholar 

Download references

Acknowledgements

This work was supported by the US National Institutes of Health (NIH) (F32-GM129960 to T.d.R. and R01-GM085770 to B.S.M.) and European Research Council Starting Grant 948770-DECIPHER (to M.H.M.). The authors thank members of the Moore and Medema laboratories for helpful discussions.

Author information

Contributions

The authors contributed equally to all aspects of the article.

Corresponding author

Correspondence to Bradley S. Moore.

Ethics declarations

Competing interests

M.H.M. is a co-founder of Design Pharmaceuticals and a member of the scientific advisory board of Hexagon Bio. The other authors declare no competing interests.

Additional information

Peer review information

Nature Reviews Genetics thanks C. Gruber and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Natural products

Organic compounds originating from living organisms or natural sources, often prized for their medicinal properties or other biological activities of utility to humanity. The term is typically used to refer to products of secondary metabolism, but also includes primary metabolites.

Specialized metabolites

Natural compounds of limited clade-specific or niche-specific distribution, known or presumed to have a specialized role in ecology or physiology.

Secondary metabolites

Metabolites that are not strictly required for growth and development, as opposed to primary metabolites, but are often important for survival of an organism in its environment. In the classical meaning, secondary metabolites do not include proteins or large gene-derived peptides that are not post-translationally modified by enzymes.

Siderophore

A metabolite that binds (chelates) iron ions from the environment and is re-imported back into a cell for iron acquisition. Other ‘metallophores’ bind trace metals such as zinc and copper.

Biosynthetic genes

Genes encoding enzymes that catalyse transformations in a biosynthetic pathway.

Ribosomally synthesized and post-translationally modified peptide

(RiPP). A peptide biosynthesized through the action of tailoring enzymes on a ribosomally translated precursor peptide.

Heterologous expression

Expression of one or more genes originating from one organism in another organism; often used to obtain higher production titres or to independently verify their chemical structure or biological function.

Biosynthetic gene clusters

(BGCs). Sets of genes that are physically co-located on a chromosome and together encode the production, regulation and transport of one or more specific metabolites.

Polyketide synthases

Enzymes involved in the biosynthesis of polyketide metabolites; some form modular assembly lines of multidomain proteins, whereas others act as stand-alone enzymes.

Non-ribosomal peptide synthetase

(NRPS). An enzyme involved in the polymerization of amino acids or other organic acids into peptide metabolites without involvement of the ribosome.

Horizontal gene transfer

Acquisition of genetic material by one organism, originating from another. This is often facilitated by plasmids, viruses or mobile elements.

Profile hidden Markov models

(pHMMs). Computational models, trained on a multiple-sequence alignment of a protein family, used to assess whether proteins are part of (or related to) a family.

Gene cluster families

Families comprising a set of similar biosynthetic gene clusters across strains or species, the members of which are responsible for the production of the same or very similar metabolites.

Heterologous host

An organism different from the source organism of a gene under investigation, usually a model organism with a well-developed genetic toolkit. A heterologous host optimized for a specific biotechnological application such as small-molecule production is sometimes called a ‘chassis’.

About this article

Cite this article

Medema, M.H., de Rond, T. & Moore, B.S. Mining genomes to illuminate the specialized chemistry of life. Nat Rev Genet 22, 553–571 (2021). https://doi.org/10.1038/s41576-021-00363-7

  • DOI: https://doi.org/10.1038/s41576-021-00363-7
