1 Introduction

Large-scale declines have been documented in multiple insect pollinator populations, which raises significant concerns, as pollinators are critical for agricultural production and maintaining healthy ecosystems (LeBuhn et al. 2012; Potts et al. 2010). Many factors contribute to these declines, including pathogens, parasites, pesticide exposure, altered landscapes contributing to inadequate nutrition, and human-driven global climate change, as well as their synergistic effects (Goulson et al. 2015). Viral pathogens are a particularly interesting driver of bee declines, as pathogens have been found to cause symptoms ranging from asymptomatic to lethal (Grozinger and Flenniken 2019), synergize with many other stressors (Di Prisco et al. 2013; Doublet et al. 2015), and readily infect numerous bee species, and thus are transmitted across co-foraging pollinator communities (Tehel et al. 2016). Recent studies have used metagenomic approaches to characterize populations of viruses infecting different bee species, and have identified several “novel” viruses that have previously not been identified in bees (Galbraith et al. 2018; Remnant et al. 2017; Schoonvaere et al. 2018).

One recent metagenomic survey, Galbraith et al. (2018), studied the viromes of bee communities across nine countries spanning five continents, identifying 127 sequences representing previously undescribed viruses in bees. Many corresponded to (+)ssRNA viruses, which are most commonly found in honey bees, and were assigned names based on their similarities to known viral families (Table I). These (+)ssRNA viruses included the Dicistro-like, Noda-like, Seco-like, and Tymo-like viruses. The Galbraith et al. (2018) study also identified other RNA virus types rarely described in bees previously, including (−)ssRNA (bee rhabdovirus) and dsRNA (Partiti-like virus). Additionally, the authors provided the first evidence of circular ssDNA viruses (denoted Circo-like viruses 1 and 2) infecting bees. Furthermore, many of these potential viruses were identified in both Apis mellifera honey bees and co-foraging, non-Apis bee specimens, suggesting the potential for a wide distribution of these viruses across bee communities (Table I).

Table I Novel virus targets for the 2015 NHBS, with abbreviations, from Galbraith et al. 2018. The continents where the viruses were first identified, with US-specific sites, can also be found along with the species in which viruses have been identified

Since the viruses from Galbraith et al. (2018) were identified in bee populations across the five continents assessed, it suggests that these pathogens are broadly distributed and potentially quite prevalent. However, since metagenomics studies often examine only a small number of samples, it is difficult to extrapolate from these studies the prevalence and distribution of these viruses on an ecologically relevant scale. For example, while some viruses were identified within the USA, this survey was not an exhaustive screen, only testing in three of 50 states (Galbraith et al. 2018). A much more detailed sampling effort is required to fully assess virus distribution and the potential impact on pollinator populations.

To track disease associated with losses in managed honey bee colonies within the USA, the US Department of Agriculture (USDA) Animal and Plant Health Inspection Service (APHIS) National Honey Bee Disease Survey (NHBS) was initiated in 2009, and collections and screenings continue annually (Traynor et al. 2016). Samples include bees collected from hundreds of apiaries in locations across the USA, including approximately 25–40 states and the territory of Puerto Rico. Live bee samples are sent to the USDA–Agricultural Research Service (ARS) Beltsville Bee Research Laboratory, where molecular techniques are employed to test for the presence of seven common known bee viruses (Gisder and Genersch 2017). The archived samples from APHIS NHBS serve as an important collection that can be used to monitor the presence of newly identified viruses as well as uncovering the possible historical presence of viruses thought to be recently introduced into the US bee populations. Indeed, the NHBS collection was recently used to document the rapid spread of a second master variant of deformed wing virus (DWV), known as DWV-B or Varroa destructor virus-1, across the USA in only 6 years (Ryabov et al. 2017).

As the Galbraith et al. (2018) study found that the highest number of novel viruses were identified in North America, we utilized the collections from the APHIS NHBS in 2015 to screen for the eight recently identified viruses, providing finer scale information of their prevalence and distribution within the USA.

2 Materials and methods

Samples

At the time of this study, the NHBS from 2015 collected the highest number of samples compared to other years to that date, and was therefore selected for this study; conveniently, this was also the same year as the bee collections for Galbraith et al. (2018). Sample collection is described in Traynor et al. (2016). Briefly, each sample consisted of pools of 50 bees from a single apiary, for a total of 800 samples from 35 states and Puerto Rico. RNA extraction (Qiagen RNEasy, Hilden, Germany) and cDNA synthesis with 2 μg total RNA (U Superscript II reverse transcriptase, Invitrogen, Carlsbad, California, USA) were previously conducted as in Traynor et al. (2016). Prior to our study, cDNA samples were kept at – 20 °C for long-term storage.

Sample pooling

To expedite screening, cDNA from samples collected within the same state were pooled together, with 800 original cDNA samples combined into a total of 68 pools. The pooling strategy consisted of adding 2 μl of original cDNA to each pool, with 10–15 apiaries per pool, and 1–2 pools per state. However, 3 pools were created for California, as this state had a total of 43 apiaries due to the high density of apiaries in the state used for pollination services. Sample ID per pool can be found in the supplemental tables (see Supplementary Table 1 “Samples with Pooling ID”).

Viral detection

Primers used in this study can be found in Table II. These primers were designed to be unique to the replicase regions of the novel putative viruses and were made using Primer3 and confirmed with Primer Blast. These primers were previously utilized in Galbraith et al. (2018), with 7/8 (all but Circo-like 1) confirmed to amplify the correct target sequence via Sanger sequencing at the Genomics Core Facility at Penn State University.

Table II Primers used in qPCR screen

Virus presence was assessed via quantitative PCR (qPCR). The following conditions were used for the qPCR assessment: each well-contained 5 μl PowerUp Syber Green Master mix (Applied Biosystems, Foster City, California, USA), 1 μl each of 10 μM Forward and Reverse Primer, 1 μl H2O, and 2 μl cDNA (approximately 53 ng) for a total of 10 μl per reaction. qPCR was conducted using a 7900HT Fast Real-Time PCR system (Applied Biosystems) under the following parameters: 50°C for 2 min, 95°C for 10 min, then cycle 40 × 95°C for 15 s and 59 °C for 1 min, and then a dissociation stage step for melting curve analysis. When possible, positive controls were included on each plate, consisting of the original cDNA from Galbraith et al. (2018). Negative controls on each plate consisted of a no-cDNA H2O sample. For 7/8 virus targets, R2 efficiency was calculated from 1:10 serial dilutions of positive controls or samples—all R2 values were greater than 0.95 (Supplementary Table “Pool ct values”).

Ct value was used to score the presence or absence of virus within a sample by our defined detection threshold. Detection threshold for virus identification was determined by the lowest measured Ct in an H2O (negative control) sample, corresponding to a Ct of 30.71. A sample with a Ct below 30.71 was considered a “Positive” identification for the novel virus, and above as “Negative” (for Ct values for positive and negative controls, see Supplementary Tables). Additionally, melting curve temperatures were measured and Sanger sequencing was used for confirmation of target amplification and exclude samples with possible off-target amplification due to primer–dimers or other artifacts. In the case of Circo-like 2 and Tymo-like virus, melting temperatures were difficult to distinguish between samples and negative controls. While it is not standard protocol to report on data that similarly amplified in the negative control, we chose to report the potential presence of these viruses, as positive controls were sequence confirmed (and had much lower Ct values, 21 and 10), and detection in NHBS samples was low, 1 or 5 locations, respectively. Nevertheless, these results should be interpreted with caution.

3 Results

The novel viruses in this study showed extreme differences in frequency, ranging from no detection to positive identification in nearly every location tested (Table III, Figure 1). Two viruses were the most common: bee rhabdovirus (BRV) and Partiti-like virus (PLV) were identified in 33/36 and 28/36 locations, respectively. Tymo-like virus (TLV) was detected in five states, and Seco-like virus (SLV) was detected in two. The two Circo-like viruses (CLV1 and CLV2) were detected in one state each. Two viruses, Noda-like virus (NLV) and Dicistro-like virus (DLV), were not positively identified in any state. Ct values can be found in Table III and Supplemental Table “Pool Ct values”.

Table III Levels and prevalence of newly identified viruses from pooled samples from the 2015 NHBS. The number of pooled samples showing any amplification and amplification greater than the negative control threshold are shown. The corresponding number of states/territories meeting viral presence thresholds can also be found, as some states had 2–3 pooled samples. Individual Ct values can be found in Supplementary Table “Pool Ct values”
Figure 1.
figure 1

Average infection levels of newly identified viruses across the USA. Bold lines indicate average, with top and bottom of boxes indicating the first and third median values. Bars indicate SE, and outliers indicated with circles. The horizontal line indicates the presence threshold, 30.71, determined by the negative control. Individual Ct values can be found in Supplementary Table “Pool Ct values”. Plots were generated using the ggplot function in RStudio (R Core Team 2017).

To identify whether the pooling strategy distorted signal from low occurrence viruses, the original, un-pooled cDNA was assessed for all viruses except for BRV and PLV, which were already easily identified across the USA, and any additional screen at the apiary level would likely demonstrate a highly similar dispersal. These apiary level experiments revealed fewer locations with positive identifications (Table IV, Supplementary Table “Apiary Ct values”), and identified locations other than those from the pooled sample experiments. In this screen, TLV and SLV were detected in two separate apiaries in Washington. CLV1 was not positively identified in any un-pooled sample. CLV2 was initially detected in pooled cDNA from Florida but was identified here in un-pooled cDNA from Kentucky and Washington samples. NLV continued to be undetected in un-pooled samples, likely indicative of little to no presence in US apiaries. DLV, on the other hand, was positively identified in Washington, despite no detection in samples pooled by state.

Table IV Levels and prevalence of newly identified viruses from un-pooled samples from the 2015 NHBS. BRV and PLV were not run at the apiary level due to their high prevalence in pooled samples. The number of pooled samples showing any amplification and amplification greater than the negative control threshold are shown. The locations showing above-threshold amplification are listed. Individual Ct values can be found in Supplementary Table “Apiary Ct values”

The viruses themselves have different ranges across locations in the pooled NHBS samples (Figure 2). All states assessed and Puerto Rico have had at least one of the eight viruses identified within their state, with most states (n = 21) having two viruses in their state. Florida, with the most positive identifications in the pooled samples, had four viruses positively identified, while Washington was the highest for the un-pooled samples, also having four viruses detected.

Figure 2.
figure 2

Maps of virus presence across the US reveal distribution differences of the eight newly identified viruses. Status of each state is indicated by color, with virus “Present,” “Absent,” or untested. A virus was deemed “Present” if at least one pooled cDNA sampled met the detection threshold of lower than 30.71. Plots were generated using the “fiftystater” package in ggplot in RStudio (Murphy 2019; R Core Team 2017), and edited in Adobe Photoshop to include Puerto Rico.

4 Discussion and conclusion

In this study, we detected the presence of six of eight newly described bee viruses in honey bee apiaries across the USA (Figure 2). In particular, two viruses (BRV and PLV) are broadly distributed, and thus further tests should be conducted to evaluate the impact of infections with these viruses on bees. Four other viruses (TLV, SLV, CLV1, and CLV2) were detected in only a few locations, and thus may represent emerging or transient infections. Additionally, we demonstrated varying distributions of viruses across the states, with some states exhibiting substantially more viral diversity than others. Overall, these results provide further validation of using metagenomics approaches, such as Galbraith et al. 2018, to characterize bee viral populations but also demonstrate the critical need for the types of finer scale mapping studies conducted in this study. Indeed, some of the most prevalent viruses in the metagenomics study were not prevalent in the NHBS samples (e.g., SLV), and vice versa (e.g., BRV). Importantly, this study demonstrates the value of creating and maintaining large-scale collections of bee population samples that are broadly accessible to the research community.

Our results provide critical insights for designing large-scale virus screens in bee populations. First, for low prevalence viruses, screening individual apiaries provided similar limited ability to detect viruses as screening samples pooled from 10 to 15 apiaries. In other words, screening for low-level viruses in samples of 50 bees provided similar information as screening 500–750 bees at a time. Thus, if detailed spatial information is not necessary, pooling provides an economical and efficient method for tracking viral frequency across larger regions and many more samples. In some cases, it is possible that viral RNA in one apiary was diluted by pooling with uninfected apiaries. This may explain why some states, like Oregon in the case of BRV, can score as “negative” for virus but be surrounded by states identified as “positive.” Furthermore, while our threshold approach was efficient, it was also conservative (as some may consider any amplification whatsoever as indicative of the presence of a virus) and may have resulted in false negatives for locations with low levels of viruses.

Secondly, we were able to detect circular ssDNA viruses (CLV1 and CLV2) in this study. Since Traynor et al. (2016) extracted RNA from these samples, and included a DNaseI digestion to remove genomic DNA, detection of ssDNA viruses from isolated RNA is likely indicative of transcription of these viral genomes. This provides evidence that these putative viruses are indeed infecting honey bees and actively transcribing. Thus, while Galbraith et al. (2018) used specialized approaches to obtain both RNA and DNA viruses from their samples, for a simple screening using smaller sequence fragments, using extracted RNA may be sufficient. While these results indicate active infection for the DNA viruses, it would be valuable to conduct additional analyses of virus replication of the other viruses in the study.

BRV and PLV can be found across the USA. In the case of BRV, this result may come with little surprise, as it has been identified globally and in multiple species. Its first published identification came in 2016 in populations of honey bees and Varroa destructor mites from Israel, where it was identified as “Farmington virus” (Levin et al. 2016). Soon after, rhabdoviruses were identified in honey bees and bumble bees in additional locations worldwide, further demonstrating global distribution and cross-species prevalence, and inspiring a rename to “Apis” rhabdovirus then to “Bee” rhabdovirus (Levin et al. 2017; Remnant et al. 2017). Subsequently, BRV and other putative rhabdoviruses in bees have been described in numerous additional studies (Galbraith et al. 2018; Regan et al. 2018; Schoonvaere et al. 2018; Thaduri et al. 2018). Despite identification in bee communities worldwide, no work has yet described the impact of BRV on bee health and decline, nor any co-infection dynamics with other widespread bee viruses, nor the potential role of Varroa destructor as a viral vector.

PLV is an interesting case, as viruses in the Partitiviridae family most often target plants and fungi (Nibert et al. 2014), although recent characterizations of invertebrate RNA viromes have described viruses in the Partitiviridae family in other arthropod viromes (Flynn and Moreau 2019; Garigliany et al. 2017; Shi et al. 2016). Not only is this virus found in bees but is distributed widely across apiaries in the USA. The presence of plant-targeting viruses in pollinators may not be unexpected, as other plant viruses have been detected in honey bees (Li et al. 2014b; although see Miller et al. 2014 and Li et al. 2014a), providing an effective plant virus surveillance method by using high throughput sequencing in honey bees (Roberts et al. 2018). Additional studies are needed to determine whether PLV in bees indicates that bees are serving as vector of a plant pathogen (Whitfield et al. 2015), PLV is a generalist pathogen infecting both kingdoms (Gray and Banerjee 1999), or PLV represents an evolutionary host-switching event from plants to bees (Longdon et al. 2014). Why PLV and BRV are so common yet only characterized by metagenomic sequencing, rather than after observation of symptomatic infection, may be indicative of a mild, asymptomatic infection phenotype; alternatively, severely infected bees may perish from infection before any potential disease is observed. Still, infection dynamics and evolutionary history of these viruses remain undescribed, and their ubiquity merits future characterization.

Viruses in the Tymoviridae and Secoviridae families often target plants, similar to the Partitiviridae family of viruses (Thompson et al. 2014), although TLV and SLV were not nearly as prevalent as PLV in this study. TLV was the third most common virus identified in the pooled samples in this study, but was still only detected in five locations. However, these locations ranged from the South to the Northwest US, indicating a potentially large distribution but low prevalence of the virus. SLV was one of the most prevalent viruses identified in the global metagenomic study (Galbraith et al. 2018), but was rarely found in this study.

The last two novel putative viruses from the Galbraith et al. (2018) study, NLV and DLV, were not very common in the Galbraith et al. (2018) samples, and not identified in the pooled NHBS samples, though DLV was detected in one un-pooled sample. Both of these viral families are known to infect insects (Odegard et al. 2010)—members of the Dicistroviridae family include common viruses targeting bees, such as black queen cell virus and Israeli acute paralysis virus (Bonning 2009). Since infection dynamics of these putative viruses is unknown, it is unclear if the viruses are rare due to a severe infection phenotype, leading to rare pockets of outbreaks, or represent a new pathogen to managed bees, with the potential to spread across the nation or globe (McMahon et al. 2018). Future years of the NHBS can be used to monitor and track the spread of all of these and other viruses.

Different states exhibited clear differences in viral diversity. In the case of the pooled samples, four of the six detected viruses were identified in Florida, whereas most states only had 1–2 detectable viruses. The cause of this is not immediately clear, as all statewide pools included a similar number of apiaries (10 to 15, and each Florida pool covered 11 apiaries), but Southern states might have better overwintering survival, year-round brood production, and additional ectoparasites like the small hive beetle, which may allow for viruses to persist in the environment (Doke et al. 2015; Eyer et al. 2009). Washington state, conversely, had the most detected viruses in the un-pooled samples, and this state differs from Florida dramatically in its climatic conditions. It is possible that factors such as stressful winters and high densities of crops utilizing migratory honey bee pollination services may increase pathogen levels and transmission. Future studies to characterize ecological factors influencing the epidemiology of these novel viruses will provide insight into the explanation behind the prevalence and distribution of these viruses across the USA and globe.

This study demonstrates the tremendous value of large-scale, long-term, accessible collections which can be used to screen for newly identified or emerging pathogens and parasites in honey bees or other insects, and furthermore assess their role in insect population declines (Leather 2018; Sánchez-bayo and Wyckhuys 2019). Our results provide insight into the approaches which can be used to efficiently screen these collections, namely the advantages and disadvantages of pooling samples and the ability to identify both RNA and DNA viruses using a single screen. Importantly, this study has demonstrated that two viruses in particular, BRV and PLV, are widely distributed in US bee populations, but belong to viral families grossly understudied in bees. Future studies should evaluate the transmission and infection dynamics of these viruses within both managed and wild bee populations.