1 Introduction

This article reports on epilithic bacterial biofilm composition, functions, and changes along Río de la Sabana as it flows north to south east of Acapulco, Mexico. This river and its floodplain provide high biodiversity habitats, but these are now affected by rapidly evolving urban encroachments (Arriaga-Cabrera et al. 2008). As previously reported, these encroachments have not only accelerated environmental degradation and water pollution along the river and its coastal lagoon (CONAGUA 2012; Pineda-Mora et al. 2018), but also increased flood-induced health vulnerabilities in low-lying settlements across and adjacent to the Río de la Sabana Valley (CONAGUA 2012; Rodríguez-Herrera et al. 2013).

In general, biofilms are densely populated and self-stabilizing microbiomes (Besemer 2015; Peipoch et al. 2015; Findlay and Battin 2016; Nicholls and Crompton 2017) involving algae, diatoms, fungi, bacteria, and protozoa, all enveloped in an extracellular polymeric substance matrix (EPS) on biotic and/or abiotic surfaces (Nadell et al. 2009; Villeneuve et al. 2013; Nadell et al. 2016). As such, biofilms undoubtedly affect a variety of processes and functions by way of organic matter production and decomposition, nutrient uptake and cycling, and contaminant absorption and processing (Findlay and Battin 2016; Shahsavari et al. 2017; Guasch et al. 2017; Banerjee et al. 2018). In addition, several studies have stressed that microbial biofilm compositions are subject to physical, chemical, and biological changes in water flow and quality (Pesce et al. 2010; Montuelle et al. 2010; Tlili et al. 2011; Baek et al. 2013; Langenheder et al. 2016; Bouchez et al. 2016; Hu et al. 2017). These changes can now be revealed in terms of operational taxonomic units (OTUs) using next generation sequencing (NGS) DNA technologies (Ghiglione et al. 2014; Tan et al. 2015; Pawlowski et al. 2018). With this background, this article reports on changes in bacterial biofilm composition, diversity, and functions as affected by the extent of residential effluent pollution and related seasonal influences along Río de la Sabana. The emphasis is on epilithic biofilms because their bacterial composition would directly be affected by locational changes in water quality, varying flow rates, and desiccating exposure incidences.

2 Materials and Methods

2.1 Study Area and Field Sampling

Río de la Sabana begins in the north-east portion of the Acapulco municipality, with its headwaters reaching 2000 m above sea level. Its main flow channel is 30.7 km long and widens to about 60 m at its confluence with the coastal Tres Palos lagoon (CONAGUA 2012). Its 466 km2 basin area supports permanent flow channels with a cumulative length of 727 km, and a channel density of 1.55 km−1 (Villegas-Romero et al. 2009). The steep and mostly pristine terrain of the upper basin feeds the main flow channels with short-duration runoff. In contrast, the lower part of the terrain widens into the broad and densely populated La Sabana Valley floodplain (Pineda-Mora et al. 2018).

This study was carried out in 2017 during the dry (January) and the rainy (July) season, with samples taken from the basin headwaters to the floodplain estuary (Fig. 1, Table 1) as per the following sampling zonation (Dorigo et al. 2008):

  1. 1.

    Reference Zone (Zone 1; Locations S1, S2): located on the upper part of the sub-basin, meeting national and international standards for the protection of aquatic life, representing oligotrophic water quality conditions

  2. 2.

    Transition Zone (Zone 2; Locations S3, S4): a peri-urban zone located between the upper and mid-low parts of the basin.

  3. 3.

    Pollution Zone (Zone 3; Locations S5, S6): representing a densely populated area and eutrophic water quality conditions.

  4. 4.

    Recovery Zone (Zone 4; Location S7): a confluence area that combines water from densely populated area west of the river, with water also received from the low population area in the east.

Fig. 1
figure 1

Sampling zone and Locations S1 through to S7 along Río de la Sabana, depicting elevation change and Normalized Difference Vegetation Index (NDVI) to signify the extent of mainly residential developments within the local municipal border

Table 1 Location of the Río de la Sabana study sites by zoning and reach

In total, 14 composite biofilm samples were gathered from submerged rock surfaces (Fig. 2): 7 per dry season, and 7 per rainy season using the point contact method (Lyautey et al. 2010). At each location, three rocks of similar type and surface area (about 15 cm in diameter) were selected from a transverse profile approximately 0.3, 0.5, and 0.7 m apart. At least 30 g were collected per sample through scraping the rocks with sterile disposable brushes and using sterile Falcon tubes for collection (50 mL). These samples were stored on ice until ready for centrifugation at 6,000 rpm for 10 min to eliminate excess water. Thereafter, the samples were stored at − 20 °C until processed for DNA extraction.

Fig. 2
figure 2

Examples of biofilm-covered rocks lifted from Location 2 representing oligotrophic upland water conditions (top), and from Location 5 representing eutrophic water conditions near the residential floodplain developments (bottom). Biofilms were found to be thicker, less translucent, and greener along Locations S4 to S7

2.2 DNA Extraction, Sequencing, and Phylogenetic Analysis

Deoxyribonucleic acid (DNA) was extracted from the biofilms according to the methodology proposed by Ceja-Navarro et al. (2010) and Navarro-Noya et al. ( 2013). Sample extractions were done in triplicate. Doing so yielded 42 DNA samples for RNA analysis at Applied Biological Materials Inc. This procedure involved two-stage PCR amplification with gen 16S rDNA V3–V4 specific primers. The genomic library was subsequently generated with an Agilent 2100 Bioanalyzer, followed by qPCR quantification. DNA sequencing was accomplished using a MiSeq 600 cycle flow cell. The resulting Bcl archives were converted into FastQC data. Sequencing quality control was performed with the FastQC tool. The resulting data were processed using the QUIIME v. 1.8.0. program, with OTUs determined using pick_de_novo_otus.py software to a similarity of > 97% against the Greengenes database (Caporaso et al. 2010). Taxonomic assignments (phylum, order, class, family) were subsequently done by means of an rRNA Bayesian classifier of the Ribosomal Data Project, up to a confidence threshold of 80% (Supplementary Data I). The relative abundance of DNA sequences (i.e., paired V3 -V4 16s rRNA sequences) identified as taxonomic units at ≥ 1% are listed in Supplementary Data II. Where possible, these units were assigned as bioremediators, pathogens, aerobes, anaerobes (including facultative anaerobes), N2-fixers, and/or denitrifiers at the family or order level through literature search (Supplementary Data III). The raw nucleotide sequence reads of this study have been deposited in NCBI under BioProject ID PRJNA643497, and can be accessed using the NCBI SRA accession id entries from SAMN15468712 to SAMN15468753 (42 entries: 3 epilithic biofilm samples taken from each of the 7 locations during the dry and rainy season).

2.3 Water Quality Sampling and Analysis

As already reported by Pineda-Mora et al. (2018), water quality was sampled at the same locations during the dry season (January) and rainy season (July) in 2017 (Table 1). Fourteen physicochemical and bacteriological water quality parameters (Table 2, Supplementary Data IV) were measured following established protocols in accordance with Mexican references (NMX) and standardized alternative methods. Temperature, pH, and electrical conductivity were measured in situ. The other parameters were analyzed in a laboratory authorized through Mexican Organization of Accreditation (AC). Temperature, pH, electrical conductivity, ammonium nitrogen, nitrates, nitrites, methylene blue active substances (anionic surfactants), sulfates, and phosphates parameters were measured in triplicate. Dissolved oxygen, biochemical oxygen demand, chemical oxygen demand, total coliforms, and fecal coliforms were determined by way of single measurements per sample.

Table 2 Physicochemical and bacteriological parameters, units, and methods used

2.4 Statistical Analyses

The bacterial relative abundance data, identified by order, class, family, and unclassified, were compiled into spreadsheets. The BiodiversityR package (Kindt and Kindt 2019) in R studio was used to determine microbial diversity indices by location and season: diversity by type (Richness, Kindt and Kindt 2019), by type and abundance (Simpson 1949), and by diversity of dominants (Berger and Parker 1970).

The sums of the relative taxonomic abundance frequencies contributing to bacterial composition, functions, biodiversity indices, and water quality parameters were all compiled by location and season. Locations S1, S2. S3, S4, S5, S6, and S7 where coded 1, 2, 3, 4, 7, 6, and 5, respectively, to reflect the overall extent of residential effluent pollution. Season was coded 0 when dry and 1 when wet. The resulting variable-to-variable association patterns of these sums were (i) factor analyzed (Principal components, 2 factors, Varimax transformation) and (ii) were subjected to multivariate regression analyses to quantify the extent to which bacterial relative OTU abundances, functions, and water quality parameters relate to each other by location and season. The resulting two-factor analyses plots were sectorized by extent of pollution and season so that Sectors I and II refer to least-polluted (S1, S2, S3) and Sectors III and IV refer to most polluted (S4, S5, S6, S7) locations during the wet and dry seasons, respectively.

3 Results

3.1 Bacterial Biofilm Abundances, Phylum Level

The > 1% relative abundances of the DNA-recognized bacterial phylum (p), class (c), order (o), or family (f) in the Río de la Sabana biofilms are presented in Supplementary Data I. Altogether, 11 entries were identified at the phylum level, 27 at the class level, and 68 entries at the family/order level. Of the ≥ 1% entries, 27 were present during the dry and wet seasons, 20 occurred during the dry season only, and 24 occurred during the wet season only.

The bar-charted phylum abundance display in Fig. 3 (i) across all locations and seasons combined, (ii) across all locations by season, and (iii) across the least to most polluted locations by season revealed the following differences: increased pollution promoted the relative abundance on Proteobacteria, Bacteroidetes, Chloroflexi, at the S4-S7 locations but decreased the relative abundances of Actinobacteria, Firmicutes, Cyanobacteria, and Thermi. During the wet season, Proteobacteria and Bacteroidetes further increased their relative dominance at the S4-S7 locations at the expense of Firmicutes and Chloroflexi.

Fig. 3
figure 3

Relative abundance of biofilm bacteria grouped (i) across locations and seasons, (ii) by season across locations, (iii) from least to most polluted locations, by season, i.e., S1 to S3 versus S4 to S7

These compositional phylum changes are further summarized in Fig. 4 by the 2-factor analysis plot that differentiates the relative abundance sums of each phylum as they fall into Sector I (Locations S1-S3; least polluted, wet season), II (Locations S1-S3; least polluted, dry season), III (Locations S4-S7; wet, most polluted), and IV (Locations S4–S7; dry, most polluted). Proteobacteria, Chloroflexi, and Synergistetes clustered along the polluted Sector III to IV transitions. Acidobacteria, Actinobacteria, and Planctomycetes prevailed more in Sector I, while Thermi, Firmicutes, and Cyanobacteria prevailed more in Sector II.

Fig. 4
figure 4

Factor analysis of relative abundance (%) numbers for each bacterial phylum in 2 × 2 matrix format: left (Locations S1–S3; italic), right (Locations S4–S7); top: dry season: bottom: wet season (bold)

3.2 Bacterial Biofilm Abundances, Class/Family Levels

The relative abundance numbers in Tables 3 and 4, and the bar charts compiled as Supplementary Data I show how the bacterial orders and families were distributed by Sectors I to IV from the least to most polluted locations by season. Among these families, Comamonadaceae (motile aerobic Betaproteobacteria), Rhodobacteraceae (chemo-organo and phototrophic Alphaproteobacteria), Sphingomonadaceae (aerobic chemo-heterotrophic Alphaproteobacteria), and Moraxellaceae (saprophytic Gammaproteobacteria) can be seen to be the most dominant (Fig. 5). However, only Comamonadaceae and Rhodobacteraceae were present at ≥ 1% at all seven locations and both seasons (Table 3). DNA sequences associated with Moraxellaceae ≥ 1% were absent at S2 during the wet season, and absent at S1, S2, S3 during the dry season.

Table 3 Relative abundance of biofilm bacteria occurring at ≥ 1% persisting during the dry and wet seasons (along Río de la Sabana, by location (S1 to S7). Taxonomic units refer to order (o), or family (f)
Table 4 Relative abundance of biofilm bacteria occurring at ≥ 1% (i) only during the dry (top) or (ii) only during the wet (bottom) season along Río de la Sabana, by location (S1 to S7). Taxonomic units refer to phylum (p), order (o), class (c), or family (f)
Fig. 5
figure 5

Most frequent bacterial order and family occurrences within Río de la Sabana biofilms: relative abundances averaged across S1 to S7 during the dry (left) and wet (right) season

Most of the other families occurred at < 10% abundances, but their relative abundances were strongly affected by pollution and season, and somewhat related to their sector I to IV phylum assignments as apparent from Fig.4 and Tables 3 and 4. For example, classes and families associated with Acidobacteria, Cyanobacteria, and Planctomycetes occurred for the most part in Sector I and II (least polluted during the wet or dry seasons, respectively). In detail, the phototrophic Acaryochloridaceae, Chamasiphomnaceae, Chroococcales, Phormidiaceae, Pseudanabaenales, Stramenopiles, and Xenococcaceae members of the Cyanobacteria phylum occurred with relative abundances ≥ 1% only at the upland S1 to S4 locations (Sectors I and II), and more so during the dry than the wet season (Table 3, Fig. 5).

Factor analyzing the Tables 3 and 4 entries by extent of pollution and season revealed three clusters in the resulting 2-factor, 4-Sector plot in Fig. 6: one cluster in Sector I (least polluted, wet season), one in Sector II (least polluted, dry season), and one straddling Sectors III and IV (most polluted, with entries somewhat influenced by dry and wet season). In this, residential pollution or lack thereof also impacted the relative abundances of the bacterial OTUs. In detail, Alpha- and Betaproteobacteria occurred in all four sectors; Gammaproteobacteria were exclusively found in Sectors III and IV, while Deltaproteobacteria were considerably less abundant (Fig. 6; Supplementary Data I). While Proteobacteria dominated in Sectors III and to IV, some were also be associated with Sectors I and II, especially soil-associated Rhizobiales, Hyphomicrobiaceae, Phyllobacteriaceae, Bradyrhizobiaceae, Oxalobacteraceae, Acetobacteraceae, Phyllobacteriaceae, and Syntrophobacteraceae.

Fig. 6
figure 6

Factor-analyzed pattern for correlational bacterial family and order abundance pattern, colored by phylum association, with additional reference to α, β, γ, and δ Proteobacteria orders

Families listed under Thermi (Trueperaceae, Deinococcaceae, Thermaceae, Erysipelotrichaceae) where present at ≥ 1% relative abundance spread across Sectors I, II, and III, while Firmicutes classes and families (i.e., Mogibacteriaceae, Clostridiales, Bacillaceae, Christensenellaceae, Peptostreptococcaceae, Ruminococcaceae, Exiguobacteraceae) spread across Sectors I to IV (Fig. 6). In contrast, Bacteriodetes families as represented by Cytophagaceae, Chitinophagaceae, Flavobacteriaceae, and Weeksellaceae were present at ≥ 1% during the wet season only. Sphingomonadaceae were present at ≥ 1% at most locations except at S5 and S6 during the dry season: hence more abundant at the non-polluted locations during the wet season (Table 3). Rhizobiales (N2-fixing microsymbiotic Alphaproteobacteria) were also present during the dry and wet season, but with a strong preference for the less polluted upper Río de la Sabana locations during the dry season.

3.3 Bacterial Biofilm Diversity

The bacterial diversity assessment results based on the Table 3 entries are listed in Table 5, by location and season as coded. Factor analyzing these results identified pollution and season as predictive factors for bacterial richness by type (Richness index, higher during the wet than the dry season), diversity abundance (Simpson index, increasing with decreasing pollution level), and dominance (Berger index, increasing with increasing pollution level). Expressed in regression terms, the resulting Factor 1 scores were related to pollution extent, and the Factor 2 scores related to season, as follows:

$$ \mathrm{Factor}\ 1\ \mathrm{scores}\ \mathrm{related}\ \mathrm{to}\ \mathrm{pollution}=-1.566+0.391\ \mathrm{Location}\ \mathrm{code};{R}^2=0.635 $$
(1)
$$ \mathrm{Factor}\ 2\ \mathrm{scores}\ \mathrm{related}\ \mathrm{to}\ \mathrm{season}\kern1.25em =-0.948+1.897\ \mathrm{Season}\ \mathrm{code};{R}^2=0.932 $$
(2)
Table 5 Compilation of diversity indices pertaining to the bacterial composition of Río de la Sabana biofilms, by location and sampling season

3.4 Bacterial Biofilm Functions

The results of summing the relative abundances of bacterial families, classes, orders, or phylum levels according to their presumed bioremediation, pathogen, aerobic, anaerobic, N2-fixing, and denitrifying functions (see Supplementary Data III) are presented Fig. 7, by season and location. These plots reveal higher relative abundances of bioremediators, pathogens, anaerobes, and denitrifiers at Locations S4 to S6. In contrast, aerobes and N2-fixers were nearly evenly distributed across the S1 to S7 locations, and registered low relative abundances during the dry and wet seasons.

Fig. 7
figure 7

Relative abundance of biofilm taxa pertaining to bioremediators, pathogens, aerobes, anaerobes, denitrifiers, and N2-fixers along Río de la Sabana, by location and season

3.5 Water Quality Related to Microbial Biofilm Diversity and Functions

Factor analyzing the water quality measures as listed in Supplementary Data IV in combination with the bacterial diversity results in Table 5 and the relative abundances by bacterial functions as plotted in Fig. 7 led to the following observations by way of Fig. 8:

  1. 1.

    High fecal coliform counts, high electrical conductivities, and elevated temperatures were—for the most part—associated with the polluted locations.

  2. 2.

    These locations were also associated with high pathogen and bioremediator occurrences, which—in turn—registered elevated biological and chemical oxygen demands (BOD, COD), elevated NH3, NO2, NO3, SO42− and PO43− concentrations, and low DO concentrations.

  3. 3.

    There were also seasonal differences: pH as well as NO2, NO3, NH3 and SO42− concentrations scored higher during the dry season.

  4. 4.

    The bacterial Richness index fell on Sector I, meaning that the fresh-water locations had the highest variety of OTUs during the wet season. In comparison, OTU relative abundances as per the Simpson index fell on Sector II, meaning that the relative OTU abundance numbers varied the most within the freshwater locations during the dry season. In contrast, the diversity index for dominance (Berger) fell into Sector III in association with increased effluent pollution and increased abundances of biofilm-resident denitrifiers, anaerobes, aerobes, bioremediators, and pathogens. These location- and season-specific diversity changes can also be discerned through inspecting the Supplementary Data I OTU bar charts, and most clearly so by focusing on the OTU class compilation.

  5. 5.

    Aerobes prevailed more consistently during the dry than the wet season (Sector IV), anaerobes and denitrifiers dominated during the wet season (Sector III), while N2-fixers prevailed in Sector II.

Fig. 8
figure 8

Factor analysis plot combining the general association patterns between (i) relative bacterial biofilm taxa abundances pertaining to bioremediators, pathogens, aerobes, anaerobes, denitrifiers and N2-fixers (white white squares), (ii) water-quality parameters (Table 2; colored dots), and (iii) the bacterial diversity indices (black circles), by Sectors I to IV, with each signifying one of the four dry and wet season combinations pertaining to low and high pollution extent, as in Fig. 4

By way of regression analysis, BOD and COD varied by location and season as follows:

$$ {\log}_{10}\mathrm{BOD}=\left(1.33\pm 0.07\right)+\left(0.14\pm 0.01\right)\ \mathrm{Location}-\left(0.69\pm 0.06\right)\ \mathrm{Season},{R}^2=0.958,\mathrm{RMSE}=0.10 $$
(3)
$$ {\log}_{10}\mathrm{COD}=\left(1.51\pm 0.08\right)+\left(0.15\pm 0.02\right)\ \mathrm{Location}-\left(0.58\pm 0.07\right)\ \mathrm{Season},{R}^2=0.938,\mathrm{RMSE}=0.12 $$
(4)

where locations and seasons were coded again as per Table 5. Correspondingly, dissolved oxygen levels decreased with increased biological and chemical oxygen demand, and this decrease was more pronounced during the wet season, i.e.,

$$ \mathrm{DO}=\left(15.7\pm 0.9\right)-\left(5.6\pm 0.4\right)\ {\log}_{10}\mathrm{COD}-\left(2.7\pm 0.4\right)\ \mathrm{Season},{R}^2=0.940,\mathrm{RMSE}=0.53 $$
(5)

Note that the relative abundance values for the aerobic and anaerobic biofilm families did not enter this formulation in a significant way. This likely related to the general expectation that aerobes and anaerobes function best when and where DO remained steady at high or low levels, respectively.

In reference to seasonal water quality changes, anaerobes including denitrifiers were prevalent under neutral pH condition. In contrast, N2-fixers and organic-matter consuming aerobes were associated with increasing pH levels likely due to (i) converting N2 to NH3 and (ii) heterotrophic release of CO2 from organic acid groups. Relating the reported pH values to season, location, and bacterial biofilm functions generated the following equation:

$$ \mathrm{pH}=\left(7.3\pm 0.1\right)+\kern0.37em \left(1.18\pm 0.24\right)\ \mathrm{Bioremediators}+\kern0.37em \left(10.6\pm 1.0\right)\ {\mathrm{N}}_2-\mathrm{fixers}-\left(0.54\pm 0.08\right)\ \mathrm{Season};{R}^2=0.888;\mathrm{RMSE}=0.23 $$
(6)

where N2-fixers and bioremediators were entered by their sum of relative abundance per location and season.

In terms of electrical conductivity, which indexes overall effluent concentrations, the regression analysis produced the following result:

$$ \mathrm{EC}=\left(1130\pm 70\right)-\left(144\pm 15\right)\ \mathrm{DO}-\left(220\pm 55\right)\ \mathrm{Season};{R}^2=0.921;\mathrm{RMSE}=103 $$
(7)

Hence, EC levels were not only lower during the wet season (likely due to higher flow rate and subsequent dilution) but were also lower where DO levels remained high, and where pollution loads were low to absent, i.e., at Locations S1 to S3.

The best-fitted scatterplots for Eqs. 3, 5, 6, and 7 are presented in Fig. 9 to further illustrate how water quality was affected by location and season. In particular, the changes on BOD (and therefore COD as well) were categorically relatable to season, and linearly relatable to location from least to most polluted. In contrast, the changes in DO were both categorically related to location and season, while the changes in pH and EC were—for the most part—categorical by season, and clustered by location.

Fig. 9
figure 9

Actual versus best-fitted values for log10BOD (Eq. 3), DO (Eq. 5), pH (Eq. 6), and EC (Eq. 7) along the seven Río de la Sabana biofilm sampling locations, by season. Dry-season sampling results are placed on gray-tone background

4 Discussion

4.1 Bacterial Changes at the Phylum Level

The overall numerical ranking sequence of phylum abundance across locations and seasons was as follows:

  • Proteobacteria > Bacterioidetes > Firmicutes > Actinobacteria > Chloroflexi > Cyanobacteria > Thermi > Acidobacteria > Planctonmycetes > Verrumicrobia.

Other stream and riverine biofilm studies also report Proteobacteria, Bacterioides, and Firmicutes to be dominant, followed in abundance by Cyanobacteria, Actinobacteria, Chloroflexi, and Verrumicrobia (e.g., Peipoch et al. 2015; Fang et al. 2017; Lin et al. 2019). Remarkably, increasing pollution levels rendered Proteobacteria, Bacterioides, Firmicutes, and Chloroflexi to dominate all the other biofilm bacteria at > 98% relative abundance during the dry and the wet season. In addition, residential effluents rendered Proteobacteria and Bacterioides to become even more dominant during the wet season through diminishing the relative abundances of Firmicutes and Chloroflexi. Among the Proteobacteria, there was a further pollution-induced dry-to-wet seasonal shift towards greater Betaproteobacteria than Gammaproteobacteria dominance. This observation parallels the relative abundance of Proteobacteria in active sludge as reported by Jiang et al. (2008):

Betaproteobacteria 39.8% > Gammaproteobacteria 20.15% > Alphaproteobacteria 6.79% > Deltaproteobacteria 4.85%.

Hence, Alphaproteobacteria would dominate biofilms in low nutrient waters such as those found at the S1, S2, and S3 locations, and heterotrophic Betaproteobacteria and Gammaproteobacteria would dominate and co-dominate polluted waters such as those found at the S4, S5, S6, and S7 locations.

At the less polluted S1 to S3 locations, Actinobacteria and Planctomycetes became more prevalent from the dry to the wet season, while the opposite occurred with Fermicutes and Thermi. These changes were likely due to (i) higher detrital availabilities that would increase Actinobacteria and Planctomycetes activities during the wet season while (ii) Fermicutes and heat-resistant Thermi would be more adapted to persist on rocky surfaces during the dry season when exposed to periods of intermittent desiccation and heat stress (Marxsen et al. 2010).

The low presence of Cyanobacteria at Locations S4 to S7 was likely caused by pollution-enhanced and light-blocking green algae growth, and this would especially be so under continuous flow conditions, as reported by Sabater et al. (2016). In addition, lower Cyanobacteria abundance at the polluted locations may also be related to high molar N:P ratios, as reported by Peipoch et al. (2019). Determining whether this also applies at the Río de la Sabana sampling locations led to the following regression result, at least in a contributing sense:

$$ Cyanobacteria\ relative\ \mathrm{abundance}=\left(0.132\pm 0.012\right)-\left(0.053\pm 0.010\right)\ \mathrm{Season}-\left(0.057\pm 0.008\right)\ \mathrm{Pollution}\ \mathrm{extent}-\left(0.00019\pm 0.00005\right)\ \mathrm{N}/\mathrm{P};{R}^2=0.863 $$
(8)

with Pollution extent and Season coded as in Table 5. Hence, Cyanobacteria abundance was lower during the wet season and at the polluted locations and was also somewhat reduced by increasing molar N:P ratios. These ratios varied from 48 to 420, were significantly lower during the wet season (p = 0.01), but were not significantly affected by location (p = 0.17).

Relating the sum of bacterial abundances at the family, order, or phylum level as listed in Tables 3 and 4 to the water quality measures by location and season did not produce significant regression results, except for Bacteriodetes and Proteobacteria, as follows:

$$ Bacteriodetes=\left(0.40\pm 0.08\right)-\left(0.039\pm 0.010\right)\ \mathrm{pH}-\left(0.010\pm 0.003\right)\ \mathrm{DO};{R}^2=0.683,\mathrm{RMSE}=0.025 $$
(9)
$$ Proteobacteria=0.24\pm 0.06+\left(0.046\pm 0.013\right)\mathrm{Location};{R}^2=0.495,\mathrm{RMSE}=0.10 $$
(10)

where Location is coded as in Table 5. These equations and their best-fitted scatterplots in Fig. 10 indicate that the epilithic biofilm presence of Bacteriodetes decreases with increasing pH and increasing DO levels, while the presence of Proteobacteria increases with increasing effluents of residential pollutants.

Fig. 10
figure 10

Actual versus best-fitted relative abundance sum plots for Bacteroidetes (Eq. 9) and Proteobacteria (Eq.10) by Locations 1 to 7 and dry (white square) versus wet (filled black circle) season

4.2 Bacterial Changes at the Order/Family Level

Of the 68 taxonomic units identified, bioremediators belonging to the Comamonodaceae, Rhodobacteriae, and Moraxellaceae families dominated at locations S5 and S6, likely due to increased effluent/pollution levels. In contrast, biofilm-forming Sphingomonadaceae (Glaeser and Kämpfer 2014) and biofilm-supporting Rhizobiaceae were more abundant along the upland streams (Figs. 2 and 3). Despite high coliform counts at Locations S5 and S6 (Supplementary Data IV), relative DNA sequences symptomatic of Enterobacteriaceae generally remained low at < 1% at all biofilm sampling locations. This would in part be due to low survival and growth of Escherichia coli biofilms on non-living surfaces (Beloin et al. 2008; Abberton et al. 2016), where Comamonodaceae, Rhodobacteriae, Moraxellaceae, and Sphingomonadaceae generally dominate (Widder et al. 2014).

Many of the bacterial families that were present but in low abundance tended to be (i) aerobic at Locations S1 to S3 (e.g., Ellin6075, Ellin6529, Acidimicrobiales, Actinomycetales, Gaiellaceae) and (ii) anaerobic at Locations S5 to S7 (e.g., Bacteroidales, Flavobacteriaceae, Desulfobulbaceae, Chromatiaceae). These prevalence differences would be in line with the corresponding stream-to-river water aeration and pollution-related changes as discussed above in reference to Sectors I to IV. Nevertheless, aerobic families such as Comamonadaceae, Rhodobacteraceae, Rhodocyclaceae, Weeksellaceae, and Xanthomonadaceae would function well at Locations S4 to S7 as aerobic sludge and waste-water treatment agents. Also of note are the contrasting occurrences of Exiguobacteriales (i.e., Bacilli members of the Firmicutes phylum: highly temperature-adaptive heterotrophs, see White et al. ( 2019) and Weeksellaceae (i.e., Flavobacteriales members of the Bacteroidetes phylum). Both are known to be aerobic to facultative intestinal chemo-organotrophs (Wikipedia 2020): Exiguobacteriales were only present at ≥ 1% during the dry season (Table 4), but Weeksellaceae—similar to Moraxellaceae—were mostly present at the S4, S5, and S6 locations during the wet season (Table 2).

The simultaneous biofilm presence of aerobes and anaerobes attest to spatial biofilm self-arrangements: aerobes on the outside, anaerobes on the inside. Furthermore, the biofilm dominance of the aerobic families would likely have contributed—at least in part—to the dry seasonal increases on water pH due to progressive alkalinity-revealing consumption of organic matter (Kayombo et al. 2002; Bernal et al. 2008; Abdullahi et al. 2008). The quantitative extent to which the epilithic biofilms affected water quality from Locations S1 to S7, however, remains unknown since that would depend on, e.g., incoming effluent loads, flow rates, and combined biofilm- and plankton-based organic matter consumption and mineralization (Romaní et al. 2004). A recent study reported an overall 14% biofilm uptake of dissolved organic matter from an agricultural stream, versus 4% from a forest stream (Graeber et al. 2019).

While there are significant correlations between bacterial biofilm diversities, functions and relative phylum, class, order, family abundances by location and/or season (Figs. 4, 6, 8, 9, 10), it is difficult to quantitatively relate specific water quality measures to bacterial biofilm functions. In part, this refers to the fact that biofilms function synergistically as self-stabilizing microbiomes with their own overall input, output, and life-supporting requirements (Peipoch et al. 2015; Toyofuku et al. 2016; Zeglin 2015). To illustrate, one would have expected that the measured changes in water pH and DO would have influenced bacterial abundances at the individual phylum, order, and family levels, but only Bacteriodetes were so correlated. In part, Bacteriodetes may be more sensitive to increasing pH and DO levels (Eq. 9) because of their anaerobic requirements and inability to produce spores. In general, however, biofilm-internal pH and DO values likely did not correlate with biofilm-external pH and DO measurements since the former would be the result of balanced microbial biofilm coexistences. For example, biofilm internal DO levels would result from combined O2 producing and reducing activities, and biofilm internal pH levels would depend on (i) the organic acidity of biofilm structures and cellular exudates, and (ii) the mix of cellular nutrient cation and anion uptake, respectively.

To discern which effects on water quality were brought about by specific family functions by location and season would be difficult due to wide phenotypic variabilities per bacterial family. In this regard, Comamonadaceae represent 29 genera within an overall 16S rRNA gene sequence similarity of 93–97%. Yet, these genera differ by function as aerobic organotrophs, anaerobic denitrifiers, Fe3+-reducers, hydrogen oxidizers, photoautotrophs, photoheterotrophs, and fermenters (Willems 2014a). Rhodobacteraceae are even more diverse, containing 100 genera involving aerobic photo- and chemo-heterotrophs, with some performing photosynthesis under anaerobic conditions. Characteristically, members of this family facilitate sulfur and carbon biogeochemical cycling, often in symbiosis with aquatic micro- and macro-organisms (Pujalte et al. 2014). Moraxellaceae include 5 genera with non-fermentative saprophytic functions, with some species recognized as pathogens (Teixeira and Merquior 2014). Sphingomonadaceae, known for their important roles as bioremediators, involve 13 genera with facultative to anaerobic fermentation functions in soils, freshwater systems, sludge, rhizospheres, and phyllospheres (Glaeser and Kämpfer 2014). Rhizobiales are primarily involved with N2-fixation, with some forming root nodules while others dominate in tap-water biofilms, with some being pathogenic (Potter 2013; Wang et al. 2018), and some being pathogen controlling. Pathogen-controlling species are associated with Actinobacteria (Wink et al. 2017), with Acidimicrobiales, Actinomycetales, and Nocardioidaceae occurring with ≥ 1% abundances at the S1, S2, and S3 locations of this study.

To reveal further details about positive and negative bacterial interactions in addition to those surmised above along Río de la Sabana locations, it would be necessary to conduct multiple location- and season-specific DNA sampling studies. In addition, such an effort should be expanded to include other biofilm constituents, e.g., green algae and diatoms (Kouzuma and Watanabe 2015; Ramanan et al. 2016). The focus on epilithic biofilms could also be expanded to biofilms residing on other mineral surfaces (e.g., sediments) and submerged macrophytes. Also, valuable would be to analyze how dissolved and total elemental concentrations in biofilms relate to corresponding changes in water quality.

5 Conclusions

The above results show how bacterial biofilm composition, functions, and water quality measures varied along Río de la Sabana from the least to most polluted locations and by season. The main effect of residential effluent discharge expressed itself through biofilm thickening caused by green algae growth and a bacterial dominance shift towards heterotrophic Proteobacteria (Comamonadaceae, Moraxellaceae, Rhodobacteraceae) and Bacteroides (Weeksellaceae). This dominance shift towards common wastewater and sludge OTUs was most pronounced during the wet season and occurred on the expense of the  ≥1% relative abundance of most other families. By bacterial biofilm function and water quality associations, Locations S5 and S6 registered the highest levels of bioremediators and pathogens. Water-sampled fecal coliform counts, biological and chemical oxygen demands, electrical conductivity, and NH3, NO3, NO2, PO43− and SO42−, concentrations were also generally highest at these locations but more so during the dry than the wet season. Dissolved oxygen levels registered highest at the least pollution locations (S1 and S2) during the wet and dry season. In contrast, pH remained near neutral during the wet season but increased to alkaline levels during the dry season. Apart from general factor-analyzed location and seasonal association patterns between bacterial composition, diversity, functions and water quality measures, correlational changes in relative abundance for any particular OTU could only be discerned for Bacteriodetes with changing pH and DO levels in sampled stream and river water.