Anelloviruses are circular negative-sense DNA viruses with genomes ranging in size from 1.6 to 3.9 kb. Anelloviruses that infect mammals have one large and two or three smaller open reading frames (ORFs) [5, 6, 71]. Gyroviruses (genus Gyrovirus) were assigned to the family Anelloviridae in 2017 [57]. Gyroviruses, which have been identified primarily in avian species, have at least three large ORFs.

Mammalian anelloviruses have been identified in a broad range of animals of the families Aotidae, Callitrichidae, Canidae, Cercopithecidae, Cricetidae, Didelphidae, Equidae, Felidae, Hominidae, Indriidae, Leporidae, Molossidae, Muridae, Mustelidae, Otariidae, Phocidae, Phyllostomidae, Procyonidae, Suidae, Tupaiidae, Ursidae, and Viverridae [1–4, 7, 8, 12–27, 29, 32–34, 39–56, 61–65, 72–76] (Supplementary Table S1). They have also been found in blood-feeding invertebrates of the families Culicidae [38] and Ixodidae [70] (these are likely derived from the blood meal taken from the host; anelloviruses do not infect these invertebrates) and in faecal samples of predators such as the South Polar skua (family Stercorariidae) [17] (Supplementary Table S1).

The family Anelloviridae was composed of 14 genera (Table 1). Except for viruses in the genus Gyrovirus, all anelloviruses had previously been classified based on pairwise identity values derived from a global alignment of nucleotide sequences of the ORF1 coding region. A 65% sequence identity threshold had been established for species demarcation and a 44% threshold for genus demarcation [5, 6]. Pairwise identity values derived from a global alignment can be deflated by fixed gaps, which becomes especially problematic when diverse sequences are included in the alignment. Here, we therefore use true pairwise sequence identity values and examine their distribution in order to establish a species demarcation threshold. We analysed the complete annotated ORF1 coding regions of complete anellovirus genome sequences (n = 749) available in the GenBank database (downloaded 10 July 2020) and determined the pairwise identity values for all the ORF1 nucleotide sequences using SDT v1.2 [37]. The plot of the distribution of the pairwise identity values (Fig. 1) revealed a trough at ~69%.
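The difference between a "true" pairwise identity and one read off a multi-sequence alignment can be illustrated with a minimal sketch. It is not the SDT implementation: it assumes a simple Needleman-Wunsch alignment with illustrative scoring values, aligns only the two sequences being compared, and scores identity over the columns in which neither sequence has a gap. The function names and scoring parameters are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences and return the two gapped strings.

    The scoring values are illustrative assumptions, not those of any tool.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def pairwise_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    aln_a, aln_b = needleman_wunsch(a, b)
    compared = [(x, y) for x, y in zip(aln_a, aln_b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)
```

Because each pair is aligned on its own, gaps forced by unrelated sequences in a large alignment cannot lower the value for that pair.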

Table 1 List of genera in the family Anelloviridae
Fig. 1 Distribution of pairwise identity values of the ORF1 nucleotide sequences of anellovirus genomes available in the GenBank database (n = 749), determined using SDT v1.2 [37]
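Locating a trough in such a distribution can be sketched as follows. This is an illustration only: the values are synthetic toy numbers constructed so that the emptiest bin sits at 69%, not the 749-sequence dataset, and the bin width and search window are assumptions chosen for the example.

```python
def histogram(values, bin_width=1.0):
    """Count identity values (percent, 0-100) per bin of the given width."""
    n_bins = int(round(100.0 / bin_width))
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v / bin_width), n_bins - 1)] += 1
    return counts


def trough_in_window(counts, low, high, bin_width=1.0):
    """Return the lower edge (percent) of the least-populated bin in [low, high)."""
    first = int(low / bin_width)
    last = int(high / bin_width)
    window = counts[first:last]
    offset = window.index(min(window))  # first bin wins if several tie
    return (first + offset) * bin_width


# Synthetic toy values (NOT real data): bin counts grow with distance from 69%.
toy_values = [b + 0.5 for b in range(50, 100) for _ in range(abs(b - 69))]
trough = trough_in_window(histogram(toy_values), 60, 80)
```

In practice the search window would be chosen by inspecting the plotted distribution, as was done for Fig. 1.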

Using 69% as a species demarcation threshold, of the 75 currently established species, viruses in only 10 species did not fit this criterion, and thus these species were abolished and the corresponding viruses were reassigned to already established species (see Table 2 for abolished species and the reassignment of viruses). The 69% species demarcation threshold has also been used for the classification of the viruses in the genus Gyrovirus, which now has nine new species (see Kraberger et al. [30]).

Table 2 Summary of recent taxonomic changes to previously established species

To establish genus demarcation criteria, since the pairwise identity plot does not give a clear demarcation threshold, we opted for a phylogeny-based approach using the ORF1 amino acid sequences. A dataset of the ORF1 amino acid sequences of a representative member of each species was assembled and aligned using MAFFT [28]. The alignment was trimmed with trimAl [11] using the gappyout option, and the trimmed alignment, which contained 361 amino acid sites, was used to infer a maximum-likelihood tree with IQ-TREE [35] under the LG+F+G4 substitution model. The resulting maximum-likelihood phylogenetic tree was rooted at the midpoint and edited in iTOL v4 [31]. Based on the phylogeny (Fig. 2), we established a total of 16 new genera (Table 1).
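The alignment, trimming, and tree inference steps described above could be scripted roughly as follows. This is a sketch under stated assumptions, not the commands used for this study: the MAFFT `--auto` mode, the `iqtree` executable name, the `-alrt 1000` replicate count, and all file names are illustrative choices, and only the gappyout option and the LG+F+G4 model come from the text.

```python
import subprocess


def build_pipeline(proteins_fasta, prefix):
    """Return the command lines for alignment, trimming and tree inference.

    File names are hypothetical; steps with "stdout" set write to that file.
    """
    aligned = f"{prefix}.aln.fasta"
    trimmed = f"{prefix}.trimmed.fasta"
    return [
        # MAFFT writes the alignment to standard output (--auto is an assumption).
        {"argv": ["mafft", "--auto", proteins_fasta], "stdout": aligned},
        # trimAl with the gappyout option named in the text.
        {"argv": ["trimal", "-in", aligned, "-out", trimmed, "-gappyout"],
         "stdout": None},
        # IQ-TREE with the LG+F+G4 model; the aLRT replicate count is an assumption.
        {"argv": ["iqtree", "-s", trimmed, "-m", "LG+F+G4", "-alrt", "1000"],
         "stdout": None},
    ]


def run_pipeline(steps):
    """Run each step in order, stopping at the first failure."""
    for step in steps:
        if step["stdout"]:
            with open(step["stdout"], "w") as handle:
                subprocess.run(step["argv"], stdout=handle, check=True)
        else:
            subprocess.run(step["argv"], check=True)
```

Calling `run_pipeline(build_pipeline("orf1_representatives.faa", "anello"))` would require the three programs to be installed and on the search path; midpoint rooting and tree editing are not covered by this sketch.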

Fig. 2 Maximum-likelihood phylogenetic tree of the ORF1-encoded amino acid sequences of a representative member of each species in the family Anelloviridae. The sequences of gyroviruses are not included, as their VP1 is not homologous to ORF1. Numbers at the nodes represent aLRT branch support values. Branches with less than 60% support have been collapsed with TreeGraph2 [60]

We used the Greek alphabet for naming new genera. In the spirit of using ancient alphabets, we adopted the Phoenician alphabet, in series, for an additional six new genera, omitting ‘bet’ (to avoid confusion with ‘beta’). In the future, the following letters of the Phoenician alphabet can be used for new genus names with minimal conflict with current names: yod, lamed, mem, samek, ayin, pe, sade, qop, res, sin, taw.

Furthermore, based on the criteria discussed above, we have established 80 new species (see Table 3). A comprehensive list of the new classification of anelloviruses is provided in Supplementary Table S1. This includes all sequences with sufficient information for their classification available in GenBank on 20 May 2021. The new species (n = 9) associated with the genus Gyrovirus are discussed in detail in Kraberger et al. [30].

Table 3 Summary of the genera, type species, species, and exemplar viruses

We recommend that true pairwise identity determination tools be used for determining anellovirus species assignments. We also recommend the following guideline for determining new species of mammalian viruses within the family Anelloviridae, which aligns with previous recommendations for full-genome pairwise identity-based classification for single-stranded DNA satellite molecules in the family Alphasatellitidae [9] and viruses in the families Circoviridae [57], Geminiviridae [10, 36, 66, 67], Genomoviridae [68], and Smacoviridae [69].

  1. If the complete ORF1 coding region nucleotide sequence of a new anellovirus shares >69% pairwise identity with that of any member assigned to a currently classified anellovirus species, the virus belongs to that particular species.

     a. In the event that the complete ORF1 coding region nucleotide sequence of a new anellovirus has >69% pairwise identity to those of members of more than one anellovirus species, the virus should be considered a member of the species with whose members it shares the highest percentage ORF1 coding region nucleotide sequence pairwise identity.

     b. In the event that the complete ORF1 coding region nucleotide sequence of a new anellovirus has >69% pairwise identity to that of one or more members assigned to a particular anellovirus species, even if it shares <69% identity with those of the majority of the members assigned to that particular anellovirus species, the virus should nevertheless be considered a member of that particular species.

  2. If the complete ORF1 coding region nucleotide sequence of a new anellovirus has <69% pairwise identity to those of all members of currently classified anellovirus species, the virus should be considered a member of a new species.
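The guideline above can be expressed as a short decision procedure. The sketch below is one possible reading, not an official tool: the function name, the input layout, and the species labels in the usage example are hypothetical. Two points are left open by the guideline and are handled by assumption here: an identity of exactly 69% is treated as not exceeding the threshold, and a tie between species is resolved in favour of the first one encountered.

```python
SPECIES_THRESHOLD = 69.0  # percent ORF1 nucleotide pairwise identity


def assign_species(identities_by_species, threshold=SPECIES_THRESHOLD):
    """Return the species a new virus belongs to, or None for a new species.

    identities_by_species maps a species name to the list of pairwise
    identities (percent) between the new ORF1 sequence and each member.
    """
    best_species, best_identity = None, threshold
    for species, identities in identities_by_species.items():
        # Rule 1b: a single member above the threshold is enough, so only
        # the highest identity within each species matters.
        top = max(identities, default=0.0)
        # Rule 1a: among several qualifying species, keep the highest identity.
        if top > best_identity:
            best_species, best_identity = species, top
    # Rule 2: nothing exceeded the threshold, so this is a new species.
    return best_species


# Hypothetical species labels and identity values, for illustration only.
example = {"species A": [72.0, 60.0, 58.5], "species B": [80.0, 50.0]}
assigned = assign_species(example)
```

The identity values passed in would come from a true pairwise identity tool, as recommended above.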

We anticipate a few more changes to the taxonomy of the family Anelloviridae, especially in the light of diverse new sequences being deposited from various studies and large viral metagenomic projects using high-throughput sequencing approaches. We also would like to highlight that metagenomic sequence-derived genomes can be classified [59].

We would also like to inform the anellovirus research community that the International Committee on Taxonomy of Viruses (ICTV) has ratified the adoption of standardized binomial virus species names, which can be either in Latinized or free-form format [58]. In establishing the nine new species in the genus Gyrovirus, we adopted a binomial “Genus + freeform epithet” species nomenclature (see Kraberger et al. [30]). We plan to adopt the binomial species nomenclature for all species in the family Anelloviridae by the year 2023. Thus, we encourage the community to engage with the ICTV Anelloviridae Study Group to determine the binomial names for current and new species in the family Anelloviridae.