Introduction

Sound travels through different media at different speeds and with different degrees of degradation. For instance, an acoustic signal propagating through water will travel faster but degrade more quickly than a similar sound traveling through air (Miyara 2003; Ballou 2008). When an acoustic signal travels in air, the scenario on which this study focuses, it can suffer: attenuation (i.e., loss or reduction of energy), degradation (i.e., loss of magnitude and temporal parameters of its structure), reflection (i.e., reduction of energy), reverberation (i.e., distortion due to its reflection on physical structures), and masking (i.e., prevention of the detection of the signal given the presence of overlapping acoustic signals with higher energy) (Miyara 2003; Ballou 2008; Rumsey and McCormick 2012). Theoretically, an acoustic signal emitted from a fixed source in air decays six decibels (dB) each time the distance from the source doubles (Bradbury and Vehrencamp 1998; Wiley 2009) and will achieve a greater transmission distance when emitted at greater amplitude and lower frequency (Knudsen 1933; Piercy et al. 1977; Ballou 2008).
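The six-decibel rule corresponds to idealized spherical spreading from a point source, under which the level drops by 20·log10(d/d0) dB between a reference distance d0 and a distance d. The following Python sketch illustrates only that idealized relationship; the function names and the example source level are illustrative assumptions, and the model deliberately ignores atmospheric absorption, obstacles, and noise.

```python
import math

def spreading_loss_db(distance_m, reference_m=1.0):
    """Loss (dB) from idealized spherical spreading between reference_m and distance_m."""
    return 20.0 * math.log10(distance_m / reference_m)

def expected_level_db(source_level_db, distance_m, reference_m=1.0):
    """Level expected at distance_m, given the level measured at reference_m."""
    return source_level_db - spreading_loss_db(distance_m, reference_m)

# Each doubling of distance removes about 6 dB from an arbitrary example level.
for d in (1, 2, 4, 8, 16):
    print(f"{d:>3} m: {expected_level_db(80.0, d):.1f} dB")
```

Any loss measured in the field beyond this baseline would be attributable to the other processes listed above (absorption, reflection, reverberation, masking).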

Acoustic signals are usually complex, with changes in their amplitude, frequency, and duration being among the components determining the maximum distance of their transmission (Rumsey and McCormick 2012). However, environmental components such as temperature, humidity, wind speed, and the presence of physical obstacles and environmental noise can also play crucial roles in their transmission (Harris 1966; Piercy et al. 1977; Rumsey and McCormick 2012; Ręk and Kwiatkowska 2016). Specifically, temperature has been shown to be associated with the speed of an acoustic signal (Ballou 2008), while high humidity is negatively associated with the loss of signal energy; thus, acoustic signals are transmitted over greater distances in humid conditions (Piercy et al. 1977). Regarding the physical obstacles that occur in the transmission path of an acoustic signal, evidence shows that the distance a signal travels becomes shorter as the number of obstacles increases (e.g., vegetation components; Martens and Michelsen 1981; Slabbekoorn et al. 2002; Slabbekoorn et al. 2007; Albert et al. 2015; Sandoval et al. 2015). Additionally, the shape, size, material, and location of the physical obstacles can shape the way in which the energy of a given acoustic signal is lost during transmission (Slabbekoorn et al. 2007; Sandoval et al. 2015). Finally, environmental noise is a major cause of acoustic signal masking, with higher noise values associated with decreases in the probability of detecting the signal (Piercy et al. 1977; Brumm 2004a; Warren et al. 2006; Slabbekoorn et al. 2007; Arroyo-Solís et al. 2013).

In animal communication, signals emitted through the acoustic channel are highly effective in the wild, even when physical obstacles are present (e.g., trees, shrubs, and even built structures; Mindlin and Laje 2005). Birds are among the wildlife groups that have developed complex systems of acoustic signaling, often allowing them to maximize the transmission distance of the sounds they produce for long-distance communication, reducing their degradation and attenuation and avoiding their masking (Bradbury and Vehrencamp 1998; Catchpole and Slater 2003; Boncoraglio and Saino 2007). These signals are mostly vocal (i.e., mainly produced by the syrinx: the anatomical structure of birds that emits sounds) and can be differentiated by their function and composition into (i) songs and (ii) calls (Brumm and Naguib 2009). Songs are acoustic signals that birds generally use in reproductive or territorial contexts, while calls are acoustic signals mostly used to transmit social and environmental information (Bradbury and Vehrencamp 1998; Catchpole and Slater 2003; Brumm and Naguib 2009).

Although birds have developed different types of acoustic signaling methods, both natural and anthropogenic noise sources and physical obstacles can determine the distance over which their signals are transmitted. Among the natural noise sources and physical obstacles, the absorption and reflection of signals by vegetation elements, temperature and humidity, wind speed, and sounds emitted by other wildlife species head the list (Harris 1966; Wiley and Richards 1982; Zollinger and Brumm 2015). Regarding noise sources and physical obstacles of anthropogenic origin, urban systems top the list (Warren et al. 2006; Mockford et al. 2011; Zollinger et al. 2017). Bird communication in urban centers has been widely studied given that not only has the physical environment been completely changed and filled with artificial structures, but human activities also produce continuous and intense noise (Marzluff 2017; Warren et al. 2006). Studies focused on the transmission of avian acoustic signals in urban environments have mainly assessed how environmental noise per se masks avian songs and calls, suggesting that organisms can modify the frequency, amplitude, composition, and duration of signals to avoid noise masking (Slabbekoorn et al. 2002; Slabbekoorn and Peet 2003; Brumm 2004b; Katti and Warren 2004; Bermúdez-Cuamatzin et al. 2009, 2011; Arroyo-Solís et al. 2013; Parris and McCarthy 2013; Slabbekoorn 2013; Sandoval et al. 2015; Grabarczyk and Gill 2020). However, urban systems pose a complicated scenario for singing birds not only because of noise pollution, but also because of the complex array of artificial structures by which acoustic signals can be trapped or obscured, among other factors (Katti and Warren 2004; Warren et al. 2006; Francis et al. 2009; Mockford et al. 2011; Sih et al. 2011; Francis 2015; Proppe et al. 2013).

In this study, we performed a field experiment to evaluate the loss of energy of recorded acoustic signals of a singing bird (Clay-colored Thrush – Turdus grayi) by recovering the signal at increasing radial distances from fixed emission points in the city of Xalapa (southeast Mexico). Considering that acoustic signals suffer degradation and progressive attenuation with increasing distance from the emission source (Mockford et al. 2011; Sabatini et al. 2011; Sandoval et al. 2015; Ręk and Kwiatkowska 2016), we expected that, as the signals were controlled in amplitude, the recovered recordings would lose energy gradually with distance from the emission source, unless they ran into a physical obstacle (Warren et al. 2006; Mockford et al. 2011). Thus, we also quantified the amount and type of physical obstacles (i.e., built infrastructure, vegetation elements) that were in the linear path between the fixed emission source and the receiving recorder. Given the nature of urban environments, we predicted that our results would be shaped by the presence and abundance of physical obstacles, with thresholds close to the source, whereas in open-space scenarios, with few or no obstacles, the emitted signal would be recovered at larger distances.

Methods

Study area

This study was performed in the city of Xalapa (19°32′2″ N, 96°55′8″ W), state capital of Veracruz, Mexico (Fig. 1). Xalapa is a small-to-medium sized city with a population of approximately 500,000 inhabitants (INEGI 2010). The urban continuum of Xalapa covers 64 km² across a 500 m elevation gradient (1100–1600 m a.s.l.) and is characterized by an important vegetation cover (~ 40% of the urban continuum; Falfán et al. 2018). Given the greenness of the city, we consider it to be an ideal scenario to study the transmission of sound, specifically a set of recorded vocalizations of the Clay-colored Thrush, in an urban area.

Fig. 1 Map of study area (Xalapa, Mexico) depicting the location of the sound emission and re-recording sites

Emitted acoustic signal

The acoustic signal we used in this study was composed of typical elements of the singing repertoire of the Clay-colored Thrush, selected considering their frequency in our recordings, as well as structure and length differences among them, and additional technical components (e.g., energy/noise ratio). This insectivorous-frugivorous species is generally found in open spaces, forest edges, and urban areas. We chose this species for this experiment given that it has an extensive vocal repertoire (consisting of more than 91 different syllable types) and is fairly common in urbanized landscapes within its geographical distribution, which ranges from southern Texas (USA) to the northern section of the Sierra Nevada de Santa Marta (Colombia). During its breeding season (February–August), the species uses most of its repertoire (Vargas-Castro et al. 2012; Collar 2019).

The acoustic signal used for this field experiment was assembled from recordings of the species in Xalapa and surrounding areas (i.e., 112 vocalizations of eight individuals; recorded by FG-G and deposited in the Sounds Library of the Birds of Mexico; http://www1.inecol.edu.mx/sonidos/menu.htm; codes: (1) 100411_01_Turdus grayi, (2) 100430_00_Turdus grayi, (3) 130430_00_ Turdus grayi Coatepec, (4) 140509_00_ Turdus grayi, (5) 160507_0846, (6) 160507_0849, (7) 160507_0850, and (8) 160507_0851). These recordings were made with a Telinga Pro-5 parabolic microphone and a Sony PCM-D50 digital recorder (sample rate: 44.1 kHz; resolution: 16 bits; file format: WAV). From the total of assessed vocalizations, we selected five male songs, from different recordings, containing 22 different notes, as well as a call, with which we created a standardized sequence of sounds. For the selection of acoustic components for the standardized sequence of notes, we used the following criteria: (1) the signal energy/noise ratio, (2) the representativeness of notes as part of the repertoire of the thrush in our recordings, (3) differences in the composition of the songs (number and types of notes), and (4) differences in the duration of the sequence (Fig. 2). Once the isolated elements were selected, they were arranged consecutively in a single track with 3-s intervals between adjacent elements. Finally, the acoustic signal was normalized to an amplitude of -1 dB in relation to the maximum amplitude value of the track (0 dB) and saved in WAV format (44.1 kHz, 16 bits) with the aim of avoiding distortions caused by compression of the information (Crowder 2015). In summary, the final acoustic signal was composed of a sequence of five songs and two calls separated by 3-s silence intervals; this sequence was played five times, with 5-s silence intervals between repetitions.
We broadcast the signal from an active portable field speaker (SME-AFS, 100 Hz-12 kHz frequency response; Saul Mineroff Electronics, Inc.), using an iPod Nano 2nd generation.
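The assembly and normalization steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually used: it reads the -1 dB normalization as peak normalization relative to digital full scale, it substitutes synthetic tones for the isolated song elements, and all function names are hypothetical.

```python
import math

def normalize_peak(samples, target_db=-1.0):
    """Scale samples so the largest absolute value sits target_db below full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silent input: nothing to scale
    gain = 10.0 ** (target_db / 20.0) / peak
    return [s * gain for s in samples]

def assemble_track(elements, gap_s=3.0, rate_hz=44100):
    """Concatenate elements, inserting gap_s seconds of silence between neighbours."""
    gap = [0.0] * int(round(gap_s * rate_hz))
    track = []
    for index, element in enumerate(elements):
        if index > 0:
            track.extend(gap)
        track.extend(element)
    return track

def tone(freq_hz, dur_s, rate_hz=44100, amp=0.3):
    """Hypothetical stand-in for an isolated song element (a pure sine tone)."""
    n = int(round(dur_s * rate_hz))
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / rate_hz) for i in range(n)]

# Two placeholder elements separated by a 3-s silence, then peak-normalized.
elements = [tone(2000.0, 0.5), tone(3000.0, 0.25)]
track = normalize_peak(assemble_track(elements))
peak_db = 20.0 * math.log10(max(abs(s) for s in track))
print(f"{len(track) / 44100:.2f} s, peak {peak_db:.2f} dB re full scale")
```

Normalizing after assembly, as here, applies a single gain to the whole track, so the relative levels of the elements are preserved.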

Fig. 2 Spectrograms of the nine focal notes (from songs: A–H; call = LL) assessed in this study

Experimental field design

The acoustic signal was reproduced in the field and recovered (recorded at varying distances) at 26 sites across Xalapa (Fig. 1). To have a representative sample of the urban conditions of the city, we selected sampling sites taking into account the following criteria: (1) that they represented different land uses (i.e., commercial, residential), (2) that they were distributed in regions with varying levels of urbanization intensity, and (3) that they were separated from each other by a minimum distance of 800 m. At each of these sites, we located an emission point at 1.8 m from the ground as well as a site-calibration point where the signal emission energy was verified to be 80 dB(A) SPL. For this, we used a sound level meter (BK PRECISION 732A) located 1 m away from the emission point and 1 m above the ground. We set the volume of the portable field speaker (SME-AFS, 100 Hz-12 kHz frequency response; Saul Mineroff Electronics, Inc.) used to emit the acoustic signal so that the sound level meter registered 80 dB(A) SPL. We chose 80 dB(A) SPL given that species congeneric with the Clay-colored Thrush have been shown to sing at a similar volume (Common Blackbird – Turdus merula; 74–79 dB; Dabelsteen 1981).

The locations where we recovered the recordings at each sampling site were placed at six increasingly distant buffers from each emission point (i.e., 10 m, 20 m, 40 m, 80 m, 120 m, 160 m). At every buffer, we identified possible locations that were public and accessible. Afterwards, we randomly selected one of those locations per buffer as a recording point, for a total of six signal recovery locations per sampling site and a grand total of 156 recordings. We established the distances at which we set the emission and recording points based on a pilot survey we carried out in an open space (i.e., an empty parking lot) of Xalapa, where we did not detect any trace of the emitted track at distances greater than 160 m. Notably, the pilot survey was designed based on previous studies that assessed the transmission and degradation of acoustic signals (Barker et al. 2009; Sabatini et al. 2011).
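The sampling arithmetic and the random choice of one recording point per buffer can be illustrated as follows. This is only a sketch: the candidate labels, the seed, and the function name are hypothetical, and the text does not state how the random selection was actually carried out.

```python
import random

BUFFERS_M = (10, 20, 40, 80, 120, 160)  # buffer distances given in the text
N_SITES = 26                             # number of sampling sites given in the text

def pick_recording_points(candidates_by_buffer, rng):
    """Pick one accessible candidate location per buffer, uniformly at random."""
    return {buffer_m: rng.choice(candidates)
            for buffer_m, candidates in candidates_by_buffer.items()}

# Hypothetical candidate labels for a single sampling site.
rng = random.Random(2016)
candidates = {b: [f"{b}m-{label}" for label in "abc"] for b in BUFFERS_M}
points = pick_recording_points(candidates, rng)

# One recording per buffer per site gives the grand total reported above.
total_recordings = N_SITES * len(points)
print(points)
print(total_recordings)
```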

Acoustic signal recovery

We carried out field work between June and August of 2016 (07:00–10:30 h), during the bird breeding season, when the thrush is more vocal. We recovered the emitted acoustic signal with a microphone (SENNHEISER MKH-60 P48) connected to a portable digital recorder (TASCAM-DR-44WL). As a standard procedure, the digital recorder was mounted on a tripod at 1.8 m from the ground, facing the direction of the emission point (0°). We set the recording level to 7 (44.1 kHz sampling rate, 16-bit resolution). Field conditions, measured with a Kestrel 5500 weather meter, were typical for the city (16.3–26 °C; 53–90% humidity), and we avoided windy conditions (maximum wind speed when recording: 1.3 m/s).

Acoustic signal energy measurement

To evaluate the loss of energy of the emitted acoustic signal, we focused on eight different song notes and the one call note. Specifically, we focused on the first appearance of each of the notes and the call in our acoustic signal. To measure the maximum energy in decibels (dB) of the nine focal notes, we used Raven Pro 1.5, which by default uses a reference value of 1 to calculate dB values (Cornell Bioacoustics Research Program, 2011; Center for Conservation Bioacoustics 2014). Software settings were as follows: Hann window type, 512-point window size, and 50% overlap. We first eliminated low-frequency noise by applying a band-pass filter of 0.8 kHz to 10 kHz for notes and 0.8 kHz to 20 kHz for the call. We then manually drew a selection box around each note and the call, adjusting its beginning and end as well as its maximum and minimum frequency. This box allowed us to identify the point at which maximum energy was present on the spectrogram (Chariff et al. 2010).
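To make the measurement concrete, the sketch below computes a spectrogram with a 512-point Hann window and 50% overlap and reports the strongest cell in decibels relative to a reference of 1. It is not Raven Pro's algorithm and will not reproduce its calibrated values: the band limit is approximated by restricting the frequency bins searched rather than by filtering the waveform, the whole input stands in for the manually drawn selection box, and the function name and test tone are hypothetical.

```python
import cmath
import math

def peak_energy_db(samples, rate_hz=44100, win=512, overlap=0.5,
                   f_lo_hz=800.0, f_hi_hz=10000.0):
    """Return (peak dB re 1, frequency in Hz, frame start in s) of the strongest cell."""
    hop = int(win * (1.0 - overlap))
    # Symmetric Hann window and a table of DFT twiddle factors.
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (win - 1)) for i in range(win)]
    twiddle = [cmath.exp(-2j * math.pi * m / win) for m in range(win)]
    # Only bins inside the band of interest are evaluated.
    k_lo = math.ceil(f_lo_hz * win / rate_hz)
    k_hi = math.floor(f_hi_hz * win / rate_hz)
    best = (0.0, None, None)
    for start in range(0, len(samples) - win + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(win)]
        for k in range(k_lo, k_hi + 1):
            coeff = sum(frame[i] * twiddle[(k * i) % win] for i in range(win))
            power = abs(coeff) ** 2
            if power > best[0]:
                best = (power, k * rate_hz / win, start / rate_hz)
    power, freq_hz, time_s = best
    level_db = 10.0 * math.log10(power) if power > 0.0 else float("-inf")
    return level_db, freq_hz, time_s

# Hypothetical stand-in for a focal note: a short 2 kHz tone.
note = [0.5 * math.sin(2.0 * math.pi * 2000.0 * i / 44100) for i in range(2048)]
level_db, freq_hz, time_s = peak_energy_db(note)
print(f"peak {level_db:.1f} dB re 1 at {freq_hz:.0f} Hz, frame starting at {time_s:.3f} s")
```

Because the values are referenced to 1 rather than to a calibrated sound pressure, they are meaningful for comparing recordings made with the same settings, not as absolute sound pressure levels.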

Physical obstacle measurement

To characterize physical obstacles between the emission and recording points, we used high-resolution satellite images. We drew a straight line between the emission point and each of the recording points. Along each of those linear transects, we quantified the number and height of built infrastructure and vegetation elements. To characterize the height of built infrastructure and vegetation elements, we focused on the first five elements that represented potential physical obstacles between the emission and recording points. We measured the height of built infrastructure and vegetation elements with the Street View tool of Google Earth Pro.

Statistical analyses

We built generalized additive mixed models (GAMMs) to assess how the maximum energy values of the nine focal notes related to the distance of recovered acoustic signals and to the type, presence, and height of potential physical obstacles. GAMMs are a variant of generalized linear models (GLMs) that accommodate different error structures and link functions to better fit different types of variables and that allow the use of “smoothers”: non-parametric procedures in which the relationship curves are not predetermined but estimated from the data (Crawley 2013).

We first built a GAMM considering maximum energy of focal notes recovered at recording points as the dependent variable and the nine selected notes of the acoustic signal at the different re-recording distances as independent variables (including potential interactive effects of the independent variables), considering sampling site as a random factor. Subsequently, we built two GAMMs, one considering maximum energy of focal notes recovered at recording points as the dependent variable and the number and height of potential built obstacles as independent variables (considering sampling site and note as random factors), and a second one similar to the latter but including the number and height of potential vegetation obstacles as independent variables.

Finally, we ran classification and regression trees (CARTs) to identify the scenarios under which the detected variations of the maximum energy of the recovered signals varied with: (1) the distance of emission and (2) the number and height of potential physical obstacles (both built and vegetation elements). CARTs are analogous to multiple regression models, as they consider the data set in relation to a set of independent variables in a dichotomous and hierarchical way, taking into account critical and successive threshold values (Loh 2011). At each dichotomy of the CART, an independent variable with a specific threshold related to the dependent variable is identified. At the bottom of the CART, average values for the dependent variables are displayed for that specific scenario. We performed all analyses in R (R Core Team 2019).
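The way a regression tree identifies a threshold at each dichotomy can be sketched as follows: every midpoint between consecutive values of the independent variable is tried, and the one that minimizes the summed squared error around the two group means is kept. The code is a simplified single-split illustration in Python, not the R routine used for the analyses, and the energy values are toy numbers invented for the example rather than study data.

```python
def sum_squared_error(values):
    """Sum of squared deviations of values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total SSE) of the single binary split on x that best separates y."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    best = None
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate identical x values
        total = sum_squared_error(ys[:i]) + sum_squared_error(ys[i:])
        threshold = (xs[i - 1] + xs[i]) / 2.0
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

# Toy example: buffer distances from the text, invented energy values.
distance_m = [10, 20, 40, 80, 120, 160]
toy_energy_db = [90.0, 85.0, 65.0, 20.0, 0.0, 0.0]
threshold, error = best_split(distance_m, toy_energy_db)
print(threshold)
```

A full tree repeats this search within each resulting group, which is how successive, hierarchical thresholds arise; thresholds fall at midpoints between observed values, so with these buffer distances a split separating 40 m from 80 m is reported as 60 m.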

Results

Our results show that, considering the technical specifications of the equipment used, the acoustic signal emitted in the field was recovered no farther than 80 m from the emitting source. As expected, we found a negative relationship between the maximum energy of the acoustic signals recovered in the field and the distance to the emitting source. Notably, most of the recordings recovered in the field (90%) came from surveys at 10–40 m from the emitting source (Table 1).

Table 1 Average values for maximum energy (dB) of recovered signals in the field

The GAMM considering the maximum energy of the acoustic signals recovered in the field as the dependent variable and the nine selected notes of the acoustic signal at the different re-recording distances as independent variables (including potential interactive effects of the independent variables) showed a significant negative relationship between distance and the signal’s maximum energy, but no relationship with the identity of notes or with the interaction of both variables (Table 2). The CART relating the distance from the emission point and the signal’s maximum energy showed two main scenarios: (1) when the distance between the emission and recording points was less than 60 m, the signal’s maximum energy averaged 81 dB and (2) when the distance was greater than 60 m, the signal’s maximum energy averaged 7.7 dB. Each of these scenarios was followed by two more specific ones. For the first case: (1) when the distance was less than 30 m, the signal’s maximum averaged energy was highest (88 dB) and (2) when the distance to the emission source ranged between 30 and 60 m, the signal’s maximum averaged energy was 65 dB. For the second case: (1) when the distance was between 60 and 100 m, the signal’s maximum energy averaged 23 dB and (2) when the distance to the point of emission exceeded 100 m, the signal’s maximum averaged energy was 0 dB (Fig. 3).

Table 2 GAMMs relating distance, note and number and height of potential obstacles (built and vegetation), and maximum energy of focal notes recovered at recording points

Fig. 3 CARTs relating hierarchical partitioned relationships between distance (left panel) and number and height of potential obstacles (built and vegetation) with average values for maximum energy (dB) of the recovered signals in the field

The GAMMs considering the maximum energy of the acoustic signals recovered in the field as the dependent variable and the number and height of potential built and vegetation obstacles as independent variables (considering sampling site and note as random factors) showed a negative and significant relationship between the number, height, and interaction of both variables of built and vegetation elements with the signal’s maximum energy (Table 2). The CART relating potential obstacle presence and height and the maximum energy of the acoustic signals recovered in the field showed that the number of built obstacles was the most important variable explaining variance in the dependent variable. Hierarchically, the next most important variable was the number of vegetation obstacles. Four scenarios were identified by the CART in conditions with fewer than 1.5 built elements: (1) when fewer than 1.5 vegetation obstacles were present and built obstacle height was less than 7 m, the signal’s maximum averaged energy was 78 dB, (2) when fewer than 1.5 vegetation obstacles were present and built obstacle height was greater than 7 m, the signal’s maximum averaged energy was 0 dB, (3) when more than 1.5 vegetation obstacles were present and their height was less than 6.8 m, the signal’s maximum averaged energy was 90 dB, and (4) when more than 1.5 vegetation obstacles were present and their height was greater than 6.8 m, the signal’s maximum averaged energy was 14 dB.

Discussion

Acoustic communication requires signals to be transmitted between a sender and a receiver (Balsby et al. 2003). However, signals naturally degrade and attenuate, losing energy as the receiver moves away from the source (Holland et al. 1998; Lampe et al. 2004; Barker et al. 2009; Mockford et al. 2011; Poesel et al. 2016). Results of this study show that the limit for the detection of acoustic signals in the city of Xalapa, considering the nature of the acoustic signal used and that of the city, is 80 m. Specifically, results indicate that there are different scenarios under which the signal’s maximum energy differs, with built infrastructure obstacles representing the most limiting element in the transmission of the signal.

Contrary to our expectations, we found no differences in the maximum energy of the acoustic signals recovered in the field for the nine studied notes of the Clay-colored Thrush. This result differs from what has been reported previously. Previous studies have shown that the energy loss of different types of acoustic signals depends on their spectral and temporal characteristics (Morton 1975; Miyara 2003; Barker et al. 2009; Mockford et al. 2011; Poesel et al. 2016; Ręk and Kwiatkowska 2016; Graham et al. 2017). Not finding differences in the energy of the assessed notes could be explained by the short distances at which we recovered most of the information (<40 m). In fact, it has been argued that urban environments have important amounts of reflective surfaces that can limit acoustic communication (Warren et al. 2006). Previous studies focused on other bird species and on non-urban systems have assessed the transmission of acoustic signals. Similar to our pilot study in an open space of Xalapa, results of some studies indicate that acoustic signals can reach up to 160 m (Ręk and Kwiatkowska 2016); yet, other studies have found the distance to be shorter (100 m; Holland et al. 1998). Most examples from non-urban systems differ from the results of this work, which is likely explained by the nature and size of the physical obstacles that exist between the sender and the receiver. Our results regarding the assessed physical obstacles agree with previous studies that have shown that the number and size of obstacles between an emitter and a receiver are negatively associated with the energy of an acoustic signal (Albert et al. 2015; Martens and Michelsen 1981; Sandoval et al. 2015; Slabbekoorn et al. 2007; Warren et al. 2006).

A novel component of this work was that we were able to identify thresholds associated with the amount and height of physical obstacles. Specifically, 1.5 elements of built infrastructure was the most important identified threshold, basically representing a barrier for sound. Vegetation elements also had a threshold of 1.5 elements, but the energy of recovered signals tended to be higher than when built obstacles of >1.5 elements were present. Finally, an interesting finding was that for both types of physical obstacles, a height of ~7 m was identified as a common threshold. Nevertheless, the effect of built obstacles on the signal’s maximum averaged energy was far larger than that of vegetation obstacles, with the signal’s maximum averaged energy being 6.5 to 78 times lower with high built obstacles than with high vegetation ones.

As far as we know, this study is the first to evaluate the degradation of complex acoustic signals in urban systems, considering built and vegetation elements as drivers. We suggest that future studies consider additional variables when assessing the loss of energy of acoustic signals in urban areas, such as excess attenuation, signal-to-noise ratio, blur, reverberation, and differential energy loss of frequencies. Additionally, it would be desirable to evaluate urban aspects in addition to noise, such as urbanization intensity and vegetation cover. Likewise, it would be valuable to identify and evaluate the communication strategies and elements used by wildlife to maximize the transmission distance of their acoustic signals, and, therefore, to evaluate different heights of signal emission and reception. Undoubtedly, knowing the function of each of the signals and their elements will help us understand the dynamics of animal communication in urban environments (Slabbekoorn et al. 2007; Warren et al. 2006; Brumm and Naguib 2009; Bermúdez-Cuamatzin et al. 2011).