Main text

Plant pathogenic necrotrophic bacteria belonging to the Soft Rot Pectobacteriaceae (SRP) family (Pectobacterium spp. and Dickeya spp., formerly pectinolytic Erwinia spp.) are a constant threat to global agriculture (Charkowski, 2018; Pérombelon, 2002). In potato, these pathogens cause blackleg in field-grown plants as well as soft rot of potato tubers during storage and transit (van der Wolf & de Boer, 2007). Pectobacterium spp. and Dickeya spp. are recognised among the top 10 most important agricultural bacterial pathogens globally (Mansfield et al., 2012). Recently, P. parmentieri (former name: P. wasabiae) (Pitman et al., 2010) has been recognised as a threat to potato cultivation in Europe (Faure et al., 2016; Pasanen et al., 2013; Suárez et al., 2017). Since its first characterisation as a potato pathogen in New Zealand in 2010 (Pitman et al., 2010), P. parmentieri has been reported to cause severe blackleg outbreaks in potato-cultivating regions worldwide (Kastelein et al., 2020; Khayi et al., 2016).

The control of SRP bacteria, including emerging potato pathogens (e.g., P. parmentieri), has so far been ineffective (Czajkowski et al., 2011). Thus, the use of bacterial viruses (bacteriophages or phages) in agriculture has been recognised as an alternative way to control bacterial infections (for a review, see Jones et al., 2007). Phage-based biological control has already been proposed to manage diseases caused by several species of SRP bacteria (Adriaenssens et al., 2012; Czajkowski et al., 2013). However, knowledge of lytic bacteriophages infecting P. parmentieri is scarce (Czajkowski, 2016). A new lytic bacteriophage, vB_Ppp_A38 (earlier name: ϕA38), infecting P. parmentieri strains was isolated in our previous study (Smolarska et al., 2017) and characterised in detail for phenotypic features that may be important in biological control applications against its host in potato. In proof-of-concept experiments, vB_Ppp_A38 efficiently protected potato tubers against rotting caused by P. parmentieri (Smolarska et al., 2017).

Despite the growing interest in using lytic bacteriophages to control SRP bacteria and, specifically, P. parmentieri, very little information is available on phage genomes, including their organisation and structure (Czajkowski, 2016). To date, only one complete P. parmentieri phage genome is available in the NCBI GenBank database, i.e., the genome of the phage vB_PpaP_PP74 (accession: KY084243) (Kabanova et al., 2018). To our knowledge, the literature contains no further information about the genomes of other bacteriophages infecting P. parmentieri. Phage genomic data, including high-quality, complete and annotated phage genomes, are of great importance as they may accelerate and facilitate the development of bacteriophage-based biological control procedures (Bardina et al., 2016). For example, such data may be used to modify phage life cycles and/or environmental viability, to develop new phage delivery strategies and/or to better understand and alter phage host ranges (Huss & Raman, 2020).

The vB_Ppp_A38 (ϕA38) bacteriophage (Smolarska et al., 2017) was isolated from a soil sample collected in an arable field in the northern part (Pomorskie Province) of Poland, using P. parmentieri strain SCC3193 (Pirhonen et al., 1988) as a primary host. In our previous studies, phage vB_Ppp_A38 has been characterised for its morphological and phenotypic features, including biological control properties. Transmission electron microscopy (TEM) revealed that this bacteriophage belongs to the order Caudovirales, with a Podoviridae capsid morphology (Smolarska et al., 2017).

In this study, to obtain a high phage titre (10^14–10^15 plaque-forming units (pfu) ml−1) for genomic DNA purification, the phage was enriched in P. parmentieri SCC3193 cultures as described previously (Czajkowski et al., 2013). After enrichment, SCC3193 cells were removed by centrifugation (10,000×g, 20 min), and the subsequent viral suspension (50 ml) was filter-sterilised using a sterile 0.22-μm membrane filter (cellulose acetate, VWR) to remove bacterial debris. The resulting suspension containing phage particles was treated with DNase I (Sigma-Aldrich; final concentration, 0.5 mg ml−1) for 60 min at 37 °C with shaking (80 rpm) to digest residual bacterial DNA. Phage particles were further concentrated by precipitation, as described elsewhere (Czajkowski et al., 2015). Finally, the obtained high-titre phage suspension (1.5–2 ml) was used for phage genomic DNA isolation as described previously (Clokie & Kropinski, 2009).

Phage vB_Ppp_A38 genomic DNA was sequenced using two next-generation sequencing techniques: short paired-end read sequencing was done on MiSeq (Illumina), and long-read (2D) nanopore sequencing was done on MinION (Oxford Nanopore). For MiSeq genomic library preparation, the Nextera XT DNA kit (Illumina) was used. During the machine run, 799,310 paired reads of the target size 2 × 300 bp were generated. These data were sufficient to obtain mean coverage on final contigs equal to 1309.6 and 1311.2 (Std = 335.3). The MinION sequencing libraries were also prepared according to the manufacturer’s guidelines, using the SQK-NSK007 Nanopore Sequencing Kit (R9) for the Native Barcode procedure. In brief, the isolated genomic DNA was tested for purity, concentration and quality on a TapeStation 2200 (Agilent) and a Quantus Fluorometer (Promega). The genomic DNA was then fragmented in a g-TUBE (Covaris) at 5000 rpm, cleaned up using 0.4× AMPure XP beads (Beckman Coulter), repaired and further prepared following the standard manufacturer protocol. The Poretools software (Loman & Quinlan, 2014) was used to extract the 2028 reads that passed the quality threshold and were longer than 1000 bp; their mean length was 7049.8 bp. These data were sufficient to obtain a mean coverage on final contigs equal to 179.5 and 165.6 (Std = 46.5). To assemble the vB_Ppp_A38 genome, we used a hybrid approach with the SPAdes software 3.6.2 (Bankevich et al., 2012) in a single run, using error correction and long reads as a gap filler. This procedure produced a single contig of 75,764 bp for vB_Ppp_A38, which was further manually inspected and curated. The putative open reading frames (ORFs) were found using Glimmer (auto gene model) (Delcher et al., 1999) and then manually verified and compared to other known phage sequences. Repetitive genetic elements were searched for using Geneious (Kearse et al., 2012).
Finally, BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi), InterProScan (http://www.ebi.ac.uk/Tools/pfa/iprscan/) and HMMER (phmmer, UniProtKB) (Finn et al., 2011) (http://hmmer.org/) were used to annotate the vB_Ppp_A38 genome. The obtained structural and functional annotations were retested using the automated pipeline of the Institute for Genome Sciences (IGS), University of Maryland School of Medicine Annotation Service, accessed via http://ae.igs.umaryland.edu/cgi/index.cgi, and were re-analysed and curated using RAST (Rapid Annotation using Subsystem Technology; http://rast.nmpdr.org/) (Aziz et al., 2008).
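The long-read selection step and the resulting depth can be illustrated with a minimal Python sketch. It is not the Poretools implementation: the function names and the toy read lengths are hypothetical, and the coverage figure is a simple upper bound (total bases divided by genome length) that assumes every retained read maps to the 75,764-bp contig.

```python
from statistics import mean

GENOME_LENGTH = 75_764  # bp, size of the assembled vB_Ppp_A38 contig (from the text)


def filter_long_reads(read_lengths, min_length=1000):
    """Keep only reads strictly longer than min_length (bp)."""
    return [n for n in read_lengths if n > min_length]


def approximate_coverage(total_bases, genome_length=GENOME_LENGTH):
    """Upper-bound mean coverage: sequenced bases divided by genome length."""
    return total_bases / genome_length


# Hypothetical read lengths, for illustration only (not data from the study).
toy_lengths = [450, 1200, 7000, 9800, 999, 15000]
kept = filter_long_reads(toy_lengths)
print(len(kept), mean(kept))

# Upper bound from the reported summary: 2028 reads with a mean length of 7049.8 bp.
upper_bound = approximate_coverage(2028 * 7049.8)
print(round(upper_bound, 1))
```

The upper bound obtained from the reported read count and mean length is somewhat higher than the reported mean nanopore coverage, as expected if not all reads align to the final contig.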

The genome of vB_Ppp_A38 (ϕA38) (GenBank accession: KY083726) is a linear, double-stranded, 75,764-bp-long DNA sequence (Fig. 1). It contains 97 predicted open reading frames (ORFs; predicted protein-encoding genes, PEGs) and has an average GC content of 48.7%. The predicted average gene length was 732 nucleotides, and 93.7% of the genome consisted of coding sequences. Of the 97 putative PEGs, 17 (17.5%) had assigned functions, another 17 (17.5%) were unclassified with no assigned category, 58 (59.8%) were classified as conserved hypothetical, and 5 (5.2%) PEGs had unknown functions (Supplementary Table 1). The vB_Ppp_A38 genome encodes only one tRNA (position: (5′)63,863–63,780 (3′)). For the majority of genes (88.7%), the translation start codon is ATG, whereas CTG and TTG are start codons in only 10.3 and 1.0% of the genes, respectively. The functions of most vB_Ppp_A38 genes could not be identified by screening for homology with known sequences, probably because annotation data for related phages are lacking in the international genome sequence databases.
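The summary statistics above can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the coding density is approximated as gene count times mean gene length (which assumes negligible overlap between genes), and the per-codon gene counts are inferred here from the reported percentages rather than stated in the text.

```python
GENOME_LENGTH = 75_764   # bp
N_PEGS = 97              # predicted protein-encoding genes
MEAN_GENE_LENGTH = 732   # nucleotides


def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Approximate coding density (assumes genes do not overlap).
coding_density = percent(N_PEGS * MEAN_GENE_LENGTH, GENOME_LENGTH)

# Functional categories as reported in the text.
categories = {
    "assigned function": 17,
    "unclassified": 17,
    "conserved hypothetical": 58,
    "unknown function": 5,
}
shares = {name: percent(count, N_PEGS) for name, count in categories.items()}

# Gene counts per start codon, inferred from the reported percentages.
start_codon_percent = {"ATG": 88.7, "CTG": 10.3, "TTG": 1.0}
start_codon_counts = {
    codon: round(pct / 100 * N_PEGS) for codon, pct in start_codon_percent.items()
}

print(coding_density)
print(shares)
print(start_codon_counts)
```

Under these assumptions the category counts sum to the 97 predicted PEGs, and the computed shares agree with the percentages given in the text.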

Fig. 1

Genome of the bacteriophage vB_Ppp_A38 (75,764 bp). Structural and functional annotations were obtained from the IGS Annotation Service (Institute for Genome Sciences, University of Maryland School of Medicine automated pipeline http://ae.igs.umaryland.edu/cgi/index.cgi) and RAST (Rapid Annotation using Subsystem Technology, accessed via the internet http://rast.nmpdr.org/). The ORFs coding for proteins involved in DNA/RNA transcription, translation and metabolism are marked in light-blue. The ORFs coding for proteins engaged in bacteriophage particle assembly are marked in yellow, and ORFs coding for enzymes participating in metabolism are marked in red. Arrows indicate the direction of transcription and translation. Only ORFs coding for proteins with homology to known proteins are shown. The figure was generated using SnapGene ver. 2.3.4 (http://www.snapgene.com/)

The BLASTN analysis, based on the whole genome sequences, indicated more than 90% sequence identity between phage vB_Ppp_A38 and N4-like bacteriophages infecting SRP bacteria (phage vB_PatP_CB1, phage vB_PatP_CB3 and phage vB_PatP_CB4). Phylogenomic analysis based on whole-genome sequences of vB_Ppp_A38 and other Pectobacteria N4-like viruses, viz. vB_PatP_CB1 (GenBank accession: KY514264.1), vB_PatP_CB3 (KY514265.1) and vB_PatP_CB4 (KY549659.1) (Buttimer et al., 2018), belonging to the family Schitoviridae (Wittmann et al., 2020), was performed using MEGA X (Kumar et al., 2018). The genome sequence of Escherichia virus N4 (NC_008720.1), belonging to the family Schitoviridae, subfamily Enquatrovirinae, genus Enquatrovirus, was used as a reference (Fig. 2). Phylogenomic analysis was conducted using the Maximum Likelihood method and the Tamura-Nei model (Tamura & Nei, 1993). Initial tree(s) were obtained by applying the Neighbor-Join and BioNJ procedures to a matrix of pairwise distances estimated using the Tamura-Nei model and then selecting the tree topology with the superior log-likelihood value. This analysis involved five genome sequences of five bacteriophages. The codon positions included were 1st + 2nd + 3rd + noncoding. In total, there were 80,198 positions in the final dataset. Phylogenomic analysis indicated that vB_Ppp_A38 is closely related to the other Pectobacteria N4-like viruses (Fig. 2) described previously (Buttimer et al., 2018). To examine the global, genome-wide organisation (i.e., synteny among large blocks of genome sequences) of phage vB_Ppp_A38, phage vB_PatP_CB1, phage vB_PatP_CB3, phage vB_PatP_CB4 and Escherichia virus N4, multiple genome alignments were performed using the MAUVE software as described elsewhere (Darling et al., 2010). The alignment demonstrated a high level of synteny between the genomes of vB_Ppp_A38 and other N4-like Pectobacteria viruses (Fig. 3).
Likewise, comparative genomics analysis was done using EDGAR (https://edgar3.computational.bio.uni-giessen.de/cgi-bin/edgar_login.cgi) (Blom et al., 2009) to obtain insight into the core genome shared between N4-like Pectobacteria viruses and bacteriophage vB_Ppp_A38. The core (common) genome of phage vB_Ppp_A38, phage vB_PatP_CB1 and phage vB_PatP_CB3 consisted of 80 genes, whereas only 12, 5 and 5 genes were specific for phages vB_Ppp_A38, vB_PatP_CB1 and vB_PatP_CB3, respectively (Fig. 4). Also, phage vB_Ppp_A38 shared 20 common genes with Escherichia virus N4 (data not shown).
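The notion of a core genome and of genome-specific genes can be sketched with plain set operations in Python. This is not how EDGAR assigns orthologues; it assumes that genes have already been grouped into orthologous families, and the family identifiers and the function name below are hypothetical toy values, not data from this study.

```python
def core_and_specific(genomes):
    """Return (core, specific) for a dict mapping genome name -> set of gene-family IDs.

    core: families present in every genome.
    specific: for each genome, families found in that genome and in no other.
    """
    core = set.intersection(*genomes.values())
    specific = {}
    for name, families in genomes.items():
        # Union of the families found in all other genomes.
        others = set().union(*(f for n, f in genomes.items() if n != name))
        specific[name] = families - others
    return core, specific


# Toy gene-family sets (hypothetical identifiers, for illustration only).
toy_genomes = {
    "vB_Ppp_A38": {"f1", "f2", "f3", "a1", "a2", "s1"},
    "vB_PatP_CB1": {"f1", "f2", "f3", "b1", "s1"},
    "vB_PatP_CB3": {"f1", "f2", "f3", "c1"},
}

core, specific = core_and_specific(toy_genomes)
print(sorted(core))
print({name: sorted(genes) for name, genes in specific.items()})
```

In this toy example the family "s1" is shared by two of the three genomes, so it belongs neither to the core genome nor to any genome-specific set.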

Fig. 2

Phylogenomic analysis was conducted using the Maximum Likelihood method and the Tamura-Nei model (Tamura & Nei, 1993). The tree with the highest log-likelihood (−311,692.79) is displayed. Initial tree(s) for the search were obtained by employing Neighbor-Join and BioNJ procedures to a matrix of pairwise distances estimated using the Tamura-Nei model and then selecting the topology with a superior log-likelihood value. The proportion of sites where at least 1 unambiguous base is present in at least 1 sequence for each descendent clade is shown next to each internal node in the tree. This analysis involved 5 genome sequences of 5 bacteriophages. Codon positions involved 1st + 2nd + 3rd + noncoding. In total, there were 80,198 positions in the final dataset. Analyses were performed in the MEGA X program (https://www.megasoftware.net/) (Kumar et al., 2018)

Fig. 3

Genome-wide comparison of the phage vB_Ppp_A38, three N4-like Pectobacteria viruses (vB_PatP_CB1, vB_PatP_CB3 and vB_PatP_CB4) infecting P. atrosepticum (Buttimer et al., 2018) and Escherichia virus N4 (Schito, 1974). The progressive Mauve alignment shows the homologous blocks shared among the analysed bacteriophage genomes. The lines connecting the blocks indicate the corresponding positions of the homologous blocks (LCB: locally collinear block) to visualise the gene arrangement. The figure was generated using Mauve, Multiple Genome Alignment (http://darlinglab.org/mauve/mauve.html)

Fig. 4

Core genome of vB_Ppp_A38 and N4-like Pectobacteria viruses (vB_PatP_CB1 and vB_PatP_CB3). The analysis was done with EDGAR 3.0, a software platform for comparative genomics (Blom et al., 2009) accessed via https://edgar3.computational.bio.uni-giessen.de/cgi-bin/edgar_login.cgi, using complete and annotated phage genomes retrieved from the NCBI GenBank database

To our knowledge, P. parmentieri lytic bacteriophage vB_Ppp_A38 (ϕA38) is the first Pectobacteria N4-like virus from the genus Cbunavirus infecting a host other than P. atrosepticum. Likewise, this is only the second report (after phage vB_PpaP_PP74 (Kabanova et al., 2018)) of a P. parmentieri-specific lytic bacteriophage that can be used for biological control of this pathogen. We believe that this high-quality whole-genome sequence and the associated phylogenomic data will provide useful new information for fundamental molecular research on phage-host interactions. In addition, the information may be used to design better control strategies to protect potato crops from infections caused by the pathogen.