Comparative genomics of the ADA clade within the Nostocales
Introduction
Cyanobacterial harmful algal blooms (CyanoHABs) have become a commonplace ecological dysfunction in fresh and brackish water bodies around the world, and this trend is being exacerbated by cultural eutrophication and climate change (Paerl and Otten, 2013; Taranu et al., 2015; Huisman et al., 2018). In temperate regions, these blooms are often dominated by filamentous nitrogen-fixing cyanobacteria assigned to the Anabaena, Dolichospermum and Aphanizomenon genera within the order Nostocales (Cirés and Ballot, 2016; Li et al., 2016). This group has become known as the ADA clade (Driscoll et al., 2018; Österholm et al., 2020) in recognition of their membership in a well-defined and differentiated branch in the Nostocales. In recent years, there has been an explosion in the number of sequenced genomes representing ADA clade members (Driscoll et al., 2018; Teikari et al., 2019; Österholm et al., 2020; Dreher et al., 2021). This reflects high research interest in these cyanobacteria, which present major problems to humans and other organisms due to their possible toxicity, production of off-flavors that taint drinking water, and displacement of a healthy and varied phytoplankton population in water bodies (Pearson et al., 2016; Huisman et al., 2018). An understanding of the genomic differences between environmental isolates and the evolutionary processes that drive phenotypic variations is important in developing the best approaches to addressing the mechanisms behind the increasing occurrence of these CyanoHABs.
We recently reported the new sequences of nine complete and seven near-complete genomes of ADA clade members and analyzed these as part of a larger set of 45 available genome sequences to address the taxonomic status of this group of cyanobacteria (Dreher et al., 2021). We recommended that the ADA clade be considered a genus-level group comprising 10 species (ADA-1 to ADA-10; Fig. 1) on the basis of phylogenomic relationships. Currently, the clade encompasses four genera (Anabaena, Dolichospermum, Aphanizomenon and Cuspidothrix), none of which is monophyletic and whose members display an intermixed phylogeny (Dreher et al., 2021). Here, we extend that study by reporting the genome properties of the new sequences within the context of the entire ADA phylogeny to link the observed variation with potential ecological and evolutionary functions of individual lineages.
Most of the new genome sequences were obtained by direct metagenomic sequencing of DNA from recent CyanoHABs occurring in lakes and reservoirs in the Pacific NW region of the USA (Dreher et al., 2021). By representing consensus genome sequences, these metagenome assembled genomes (MAGs) reflect the most abundant populations present in the extant CyanoHABs and are therefore the chief cyanobacterial components of the natural microbial consortia associated with these HABs (Alvarenga et al., 2017). Importantly, these data are distinct from genome sequences derived from cultures, which are not necessarily established from the most abundant strain present in a HAB and which can accumulate genetic changes during culturing (Wang et al., 2012; Good et al., 2017). Our use of long-read sequencing dramatically decreases the fragmentation of genome assemblies across many contigs during the assembly process and the reliance on binning of contigs to represent complete or near-complete genomes (Yue et al., 2020). The coupled use of improved DNA sequencing technologies with improved assembly and annotation workflows has made MAGs increasingly important in exploring natural biodiversity, expanding the scope of taxonomic relationships in uncultivated lineages, and inferring the metabolic and physiological capacities of ecosystems (Hug et al., 2016; Alvarenga et al., 2017; Chen et al., 2020; Parks et al., 2020).
In this paper, we leverage the recent availability of an expanded set of complete and draft ADA genomes with a clearer phylogenetic picture of the ADA clade (Österholm et al., 2020; Dreher et al., 2021) to investigate three key issues. First, the recognition of multiple species-level branches within the ADA clade raises the question of how these subclades differ in terms of gene content, genome synteny and predicted physiology. Previous comparative studies on ADA genomes have described some clade-specific distribution of genes with physiological relevance (Driscoll et al., 2018; Österholm et al., 2020). Other comparative studies have addressed this question for all cyanobacteria (Shih et al., 2013; Chen et al., 2021) and for different clades within the Nostocales (Teikari et al., 2018), or by considering specific gene categories (Gerdes et al., 2006; Wang et al., 2011; Dittmann et al., 2015). These studies have generally found correlations of gene presence/absence with environmental niche (Chen et al., 2021) and have highlighted the metabolic diversity and versatility encoded within the genomes of the cyanobacteria (Shih et al., 2013). Our comparative genomics investigation of the ADA clade shows that strain and species differentiation is accompanied by substantial differences in gene content and synteny. These differences are described in detail and in the context of other non-ADA clade Nostocales.
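As a rough illustration of the gene-content comparison described above, the following minimal Python sketch splits gene families into a core set (present in every genome) and an accessory set (present in only some), and computes a Jaccard distance between two genomes. It is not the authors' workflow: the genome labels and gene family names (`genome_A`, `fam1`, and so on) are hypothetical toy values, and real analyses would start from ortholog clustering of annotated proteins.

```python
from typing import Dict, Set, Tuple


def core_and_accessory(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, accessory) gene families for a collection of genomes.

    core: families present in every genome
    accessory: families present in at least one genome but not in all
    """
    all_sets = list(gene_sets.values())
    if not all_sets:
        return set(), set()
    pan = set().union(*all_sets)          # every family seen in any genome
    core = set.intersection(*all_sets)    # families shared by all genomes
    return core, pan - core


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Gene-content distance: 1 - |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical toy data: three genomes described by gene family identifiers.
genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam3", "fam5"},
    "genome_C": {"fam1", "fam2", "fam3"},
}

core, accessory = core_and_accessory(genomes)
distance_ab = jaccard_distance(genomes["genome_A"], genomes["genome_B"])
```

In this toy case the three shared families form the core and `fam4` and `fam5` are accessory; the same functions could be applied per subclade to ask which families distinguish one species-level branch from another.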
Second, a leading reason for interest in ADA cyanobacteria is the variable ability of these strains to produce cyanotoxins. The widespread prevalence of toxic CyanoHABs raises the grave public health concern that dense standing stocks of toxic strains could lead to the transfer of cyanotoxin genes to nontoxic strains. Cyanotoxin genes have generally been considered to be ancient, with the contemporary sporadic phylogenetic distribution best explained by selective loss rather than recent horizontal acquisition of intact gene clusters (Rantala et al., 2004; Dittmann et al., 2013). However, horizontal exchange of cyanotoxin gene clusters has been suggested in a few instances, for saxitoxin and cylindrospermopsin genes (Kellmann et al., 2008; Jiang et al., 2012; Dittmann et al., 2013). Here, we use the expanded ADA dataset to further evaluate the possibility that cyanotoxin genes may be laterally transferred, and report the first occurrence of an entire cyanotoxin gene cluster on a plasmid.
Third, the genomes of HAB-forming cyanobacteria have been recognized as being replete with transposase genes and other mobile or repetitive elements that are assumed to contribute to genome plasticity (Lin et al., 2010; Wang et al., 2012; Humbert et al., 2013; Brown et al., 2016; Alvarenga et al., 2017; Driscoll et al., 2018). Repetitive elements are a major cause of the inability to assemble complete genomes (Kunin et al., 2008; Chen et al., 2020), and the resulting draft genomes may not contain accurate representations of the number and diversity of these elements. The increased availability of complete ADA genomes has thus allowed a wider analysis of the mobilome (Frost et al., 2005; Durrant et al., 2020) and its potential role in genome rearrangements that accompany strain evolution within the ADA clade. We investigate the importance of transposases and other mobile elements such as prophages in ADA clade genome evolution.
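One simple way to get a first impression of the mobile-element load discussed above is to tally genes whose annotated product names mention transposases or integrases. The Python sketch below does this by keyword matching on product strings. It is an assumption-laden illustration rather than the method used in this study: the keyword list and the example product strings are hypothetical, and keyword matching will miss unannotated or fragmented elements.

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical keyword patterns; a real survey would use curated profiles.
MOBILE_PATTERNS = {
    "transposase": re.compile(r"transposase", re.IGNORECASE),
    "integrase": re.compile(r"integrase", re.IGNORECASE),
}


def count_mobile_genes(products: Iterable[str]) -> Counter:
    """Count annotated gene products matching each mobile-element keyword."""
    counts: Counter = Counter()
    for product in products:
        for label, pattern in MOBILE_PATTERNS.items():
            if pattern.search(product):
                counts[label] += 1
    return counts


# Hypothetical product annotations, as might be read from a feature table.
products = [
    "IS4 family transposase",
    "photosystem II protein D1",
    "tyrosine-type recombinase/integrase",
    "IS630 family transposase",
    "hypothetical protein",
]

counts = count_mobile_genes(products)
```

Because repetitive elements are often collapsed or missing in fragmented assemblies, counts of this kind are most meaningful when taken from complete genomes.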
Genome assemblies
The methodology for sample collection, DNA extraction, library preparation and sequencing of the new genome sequences studied here was recently described (Dreher et al., 2021). Genome assemblies were derived from PacBio Sequel and Illumina HiSeq 3000 reads, assembled using HGAP (Pacific BioSciences, Inc.) and in some cases SPAdes (Bankevich et al., 2012), and then were annotated by the National Center for Biotechnology Information PGAP pipeline (Tatusova et al., 2016). Most of the genomes were
General properties of the new genomes
The general properties of the 16 newly sequenced genomes are outlined in Table 1, and more extensively in Table S1. The AFA_KM1D3, AFA_UKL13 and Ana_54 strains were unialgal cultures, while all other genomes were derived from environmental samples of recent CyanoHABs in Oregon or Washington (USA) that were used to produce metagenome assembled genomes (MAGs) (Dreher et al., 2021). The complete genome of AFA_KM1D3_PB supersedes the draft genome assembly Aphanizomenon flos-aquae 2012/KM1/D3
ADA physiology
The recent additions of multiple new ADA genomes by Österholm et al. (2020) and Dreher et al. (2021) have enabled a clearer picture of the scope and properties of the ADA clade/genus to emerge. The clade includes a surprising range of physiological strategies, although the predominant state is that of a planktonic, nitrogen-fixing photoautotroph reliant on light harvesting by chlorophyll and phycocyanin antenna complexes and a CO2 concentrating mechanism. Most members can opportunistically
Contributions
TWD designed the experimental concept, conducted research, analyzed data and wrote the paper. EWD, RSM and TGO analyzed data and assisted with paper writing.
Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Dr. Tim Otten owns and operates Bend Genetics, which provides services that include gene-based analyses of CyanoHABs, including detection of toxin genes. Bend Genetics provided no funding for this study. Dr. Otten participated in experiments reported here while he was a postdoctoral researcher in the Dreher laboratory.
Acknowledgements
We thank Aaron Trippe and Brent Kronmiller (OSU Center for Genome Research and Biocomputing) for conducting PacBio Sequel sequencing and for help with HGAP assembly, respectively. This research was supported by the Oregon State University Agricultural Experiment Station, City of Salem (Oregon), US Forest Service, and Eugene Water & Electric Board.
References (96)
- et al. (2016). A review of the phylogeny, ecology and toxin production of bloom-forming Aphanizomenon spp. and related species within the Nostocales (cyanobacteria). Harmful Algae.
- et al. (2015). Natural product biosynthetic diversity and comparative genomics of the cyanobacteria. Trends Microbiol.
- et al. (2021). Complete genomes derived by directly sequencing freshwater bloom populations emphasize the significance of the genus level ADA clade within the Nostocales. Harmful Algae.
- et al. (2018). A closely-related clade of globally distributed bloom-forming cyanobacteria within the Nostocales. Harmful Algae.
- et al. (2010). Pushing and pulling in prokaryotic DNA segregation. Cell.
- et al. (2013). Genome mining expands the chemical diversity of the cyanobactin family to include highly modified linear peptides. Chem. Biol.
- et al. (2016). An overview of diversity, occurrence, genetics and toxin production of bloom-forming Dolichospermum (Anabaena) species. Harmful Algae.
- et al. (2016). The genetics, biosynthesis and regulation of toxic specialized metabolites of cyanobacteria. Harmful Algae.
- et al. (2010). Two alternative starter modules for the non-ribosomal biosynthesis of specific anabaenopeptin variants in Anabaena (Cyanobacteria). Chem. Biol.
- et al. (2016). Biochemistry and genetics of taste- and odor-producing cyanobacteria. Harmful Algae.
- Freshwater cyanophages. Virol. Sinica.
- A completely reimplemented MPI bioinformatics toolkit with a new HHpred server at its core. J. Molec. Biol.
- MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems. PLoS One.
- Correlation between bacterial G+C content, genome size and the G+C content of associated plasmids and bacteriophages. Microbial Genomics.
- A metagenomic approach to cyanobacterial genomics. Front. Microbiol.
- A new method for non-parametric multivariate analysis of variance. Austral. Ecol.
- KBase: the United States department of energy systems biology knowledgebase. Nature Biotechnol.
- PHASTER: a better, faster version of the PHAST phage search tool. Nucl. Acids Res.
- Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Natural Product Rep.
- SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J. Comput. Biol.
- IslandViewer 4: expanded prediction of genomic islands for larger-scale datasets. Nucl. Acids Res.
- Prochlorococcus: the structure and function of collective diversity. Nature Rev. Microbiol.
- antiSMASH 5.0: updates to the secondary metabolite genome mining pipeline. Nucl. Acids Res.
- Structural and functional analysis of the finished genome of the recently isolated toxic Anabaena sp. WA102. BMC Genomics.
- Phylum-wide comparative genomics unravel the diversity of secondary metabolism in Cyanobacteria. BMC Genomics.
- Database for bacterial group II introns. Nucl. Acids Res.
- GTDB-Tk: a toolkit to classify genomes with the genome taxonomy database. Bioinformatics.
- Accurate and complete genomes from metagenomes. Genome Res.
- Comparative genomics reveals insights into cyanobacterial evolution and habitat adaptation. ISME J.
- Genomic islands and the ecology and evolution of Prochlorococcus. Science.
- CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins. Nucl. Acids Res.
- Type IV pili: dynamics, biophysics and functional consequences. Nature Rev. Microbiol.
- progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One.
- Cyanobacterial toxins: biosynthetic routes and evolutionary roots. FEMS Microbiol. Rev.
- A bioinformatic analysis of integrative mobile genetic elements highlights their role in bacterial adaptation. Cell Host Microbe.
- Selectivity in mammalian extinction risk and threat types: a new measure of phylogenetic signal strength in binary traits. Conserv. Biol. J. Soc. Conserv. Biol.
- Mobile genetic elements: the agents of open source evolution. Nature Rev. Microbiol.
- Marine viruses and their biogeochemical and ecological effects. Nature.
- Comparative genomics of NAD biosynthesis in cyanobacteria. J. Bacteriol.
- clinker & clustermap.js: Automatic generation of gene cluster comparison figures. Bioinformatics.
- The dynamics of molecular evolution over 60,000 generations. Nature.
- Significance of bacteriophages for controlling bacterioplankton growth in a mesotrophic lake. Appl. Environ. Microbiol.
- Role of pathogenicity island-associated integrases in the genome plasticity of uropathogenic Escherichia coli strain 536. Molec. Microbiol.
- Lysogeny in nature: mechanisms, impact and ecology of temperate phages. ISME J.
- A new view of the tree of life. Nature Microbiol.
- Cyanobacterial blooms. Nature Rev. Microbiol.
- A tribute to disorder in the genome of the bloom-forming freshwater cyanobacterium Microcystis aeruginosa. PLoS One.
1 Current affiliation: Bend Genetics, LLC