Having two ears and two eyes allows humans to precisely encode the spatial location of sound sources and visual objects, respectively. The inputs to a typical sensory system are ideally symmetrical. When asymmetry occurs, such as in cases of asymmetrical hearing loss, a vast array of interventions can be provided (e.g., hearing aids and bionic auditory prostheses called cochlear implants [CIs]). Bilateral hearing aids are now commonly dispensed across the life span (Kochkin, 2009). It is recommended that children who are born deaf receive bilateral CIs around 1 year of age and with less than 18 months between implantations so that there is some hope of developing usable binaural hearing pathways and spatial-hearing abilities (Litovsky & Gordon, 2016). The effectiveness of these interventions varies greatly depending on numerous factors (Knudsen et al., 2010; Litovsky et al., 2012). It is unclear, however, how asymmetrical auditory inputs diminish hearing abilities and how development contributes to these processing problems (Gordon et al., 2015; Kral et al., 2013; Tillein et al., 2016). The purpose of this study is to better understand the effect of functional asymmetry in the absence of the developmental problems that can occur in clinical populations.

Functionally asymmetrical sound inputs appear to diminish binaural hearing abilities. Binaural hearing is critical for sound localization in the horizontal plane (Wightman & Kistler, 1997) and for speech understanding in background noise (Zurek, 1992). Adults who were born with acoustic hearing, lost their hearing as adults, and received bilateral CIs demonstrated improved speech understanding when targets and interferers originated from different spatial locations (i.e., they experienced spatial release from masking; Bernstein et al., 2016). Those listeners were argued to be high performers with relatively symmetrical inputs, and bilateral CI listeners with more asymmetrical hearing appeared not to experience the spatial-release-from-masking benefit. Asymmetry partially explains the discrepancy between the Bernstein et al. (2016) study and another study by Goupell, Stakhovskaya, et al. (2018) that used the same speech-on-speech masking tests; the bilateral CI listeners in the latter study experienced interference instead of an unmasking benefit. The latter study also included a subset of listeners with earlier onsets of hearing loss, so developmental effects may also have contributed to the observed interference.

To simplify the speech-on-speech spatial release from masking task to a more controlled scenario, dichotic listening can be used, in which different speech samples are presented to opposite ears. This paradigm eliminates energetic masking (energy from the target and interferer is not analyzed in the same auditory filters in the cochlea or peripheral auditory neurons; e.g., Best et al., 2013); the target is presented to one ear and the interferer to the other. Such a situation sets up separate streams of information for each ear. Dichotic listening, in which a person selectively attends to one ear and ignores the other, is often thought to be a relatively simple task (Brungart & Simpson, 2002; Cherry, 1953; Gallun et al., 2007b; Goupell et al., 2016; Wood & Cowan, 1995a). A human's ability to selectively attend to a single ear is so good that one is unlikely to hear his or her own name in a stream of speech presented to the unattended ear (Wood & Cowan, 1995b). Stimulus information and task complexity, however, play a role in the typically excellent performance in dichotic listening. Interference from the nontarget ear becomes increasingly common as task demands increase, for example, by adding interfering sound streams or altering the similarity of the target and interferer streams, possibly because the listener expends a finite amount of resources to undertake the task (Brungart & Simpson, 2002, 2004, 2007; Gallun et al., 2007a, 2007b). Such an explanation is consistent with the relative difficulty of dichotic listening seen in children compared with adults (Wightman et al., 2003; Wightman & Kistler, 2005; Wightman et al., 2006; Wightman et al., 2010) because children may have fewer resources and/or allocate them inefficiently (Lutfi et al., 2003; Wightman & Kistler, 2005).

The ease of dichotic listening and selective attention to a single ear also occurs in most bilateral CI listeners. Goupell et al. (2016) tested 11 adult bilateral CI listeners and found that nine had relatively symmetric abilities in attending to the right or left ear. The other two listeners could easily attend to their right ear but demonstrated great difficulty attending to their left ear despite explicit instruction. The inability to easily attend to the left ear was also found in two of 10 children with bilateral CIs—those who had the longest interimplant durations (Misurelli et al., 2020). Other bilateral CI listeners in a separate study (Goupell, Stakhovskaya, et al., 2018) had a similar difficulty perceiving sound presented to one of their ears with dichotic presentation, but this did not occur with monaural presentation. This subset of results in bilateral CI listeners appears to have some similarities to the loss of binocular vision caused by amblyopia (Barrett et al., 2004; Kaplan et al., 2016; Whitton & Polley, 2011), as one explanation for the data was that these few listeners were unable to attend to an ear because its input was not perceived under dichotic presentation.

While there is mounting evidence for difficulty in understanding speech with asymmetrical inputs, particularly for accessing spatial-hearing benefits, the heterogeneity of the CI population (differences in performance due to biological, surgical, and device-related factors; Litovsky et al., 2012) makes it difficult to separate effects of functionally asymmetrical performance from developmental problems like an amblyopia-like effect (i.e., amblyaudia; see Kaplan et al., 2016, for a review). Therefore, the purpose of this study was to determine whether there are consequences for auditory processing of speech through asymmetrical inputs in the absence of developmental issues, which are often difficult to assess (Whitton & Polley, 2011). To avoid the variability that occurs in clinical populations, we produced asymmetrical inputs through a signal processing technique called "channel vocoding" (Dudley, 1939; Shannon et al., 1995), degrading the signals parametrically and independently in each ear. We hypothesized that it would be more difficult to attend to a relatively poorer ear and ignore a relatively better ear. We simulated two forms of hearing asymmetry with a CI in this study. In Experiments 1 and 2, we simulated differential spectral resolution, which is related to the physical placement of the electrodes, how electrical fields interact and overlap in the cochlea, and neural degeneration (e.g., Croghan et al., 2017; Friesen et al., 2001). In Experiment 3, we simulated differential electrode array insertion depths across the ears to produce a frequency-to-place mismatch or "shift" of frequency information. These conditions simulate some of the asymmetrical ear differences that could occur in the CI population.

Experiment 1: Selective attention and asymmetrical spectral resolution

Listeners and equipment

Ten normal-hearing (NH) listeners between the ages of 20 and 35 years were tested in this experiment (mean age = 24.7 years), a sample size based on previous similar work using vocoded speech with this type of experiment (Goupell et al., 2016). The listeners had typical hearing thresholds (≤20 dB hearing level [HL] air conduction thresholds) at 0.25, 0.5, 1, 2, 4, and 8 kHz, measured with an audiometer (Maico, MA41; Berlin, Germany). In addition, listeners were screened for asymmetrical hearing such that no listener had an interaural difference in threshold at any tested frequency of >10 dB. Seven listeners had previous exposure to vocoded speech, and all were native English speakers.

Stimuli were created on a personal computer using MATLAB (The MathWorks; Natick, MA). Listeners were presented with the stimuli via open-backed circumaural headphones (Sennheiser, HD650; Hanover, Germany). All testing was performed in a double-walled sound-attenuating booth (IAC; Bronx, NY) at the University of Maryland, College Park.

Stimuli

Stimuli were nonsense sentences with five keywords, each consisting of a name, verb, number, adjective, and object, and were the original recordings from Kidd et al. (2008). Each category had a closed set with eight possibilities per keyword, which are shown in Table 1. Words within each category had a range of similarity and confusability (e.g., “old” vs. “cold”).

Table 1 Matrix word corpus

The keywords were randomly chosen with replacement for each ear; the same word could occur in both ears, which limits the strategy of eliminating target word possibilities by identifying the interferer word (e.g., Bernstein et al., 2016). The individual words had different lengths and onset times. This occurred even for the same word because different talkers produced the target and interferer words. Therefore, the words were not time aligned across the ears.
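For concreteness, the random draw can be sketched in a few lines (Python here; the original stimuli were assembled in MATLAB, and the placeholder word lists below stand in for the actual matrix words of Table 1):

```python
import random

# Placeholder corpus: eight alternatives per keyword category (see Table 1
# for the actual matrix words; the strings here are hypothetical).
categories = ["name", "verb", "number", "adjective", "object"]
corpus = {c: [f"{c}_{i}" for i in range(1, 9)] for c in categories}

def draw_sentence(rng):
    """Draw one keyword per category, independently and with replacement,
    so the target and interferer sentences may share a word."""
    return [rng.choice(corpus[c]) for c in categories]

rng = random.Random(2024)
target_words = draw_sentence(rng)      # left-ear (target) sentence
interferer_words = draw_sentence(rng)  # right-ear (interferer) sentence
```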

The talkers in the left and right ears were female (Talker 10, F0 = 209.5 Hz) and male (Talker 2, F0 = 109.8 Hz), respectively, similar to the stimuli used in Goupell et al. (2016). The stimuli were presented at an A-weighted sound pressure level of 65 dB.

Stimuli were either unprocessed or processed using a noise vocoder. The vocoding process included an analysis stage in which the unprocessed stimuli were passed through a filter bank with 4, 8, or 16 contiguous channels. The bandpass filters in the filter bank were fourth-order Butterworth filters. The corner frequencies on the bandpass filters were logarithmically spaced and covered a frequency range from 300 to 8500 Hz (see Table 2). The envelope from each channel was extracted using a second-order low-pass filter with a 400-Hz cutoff frequency. The envelopes were used to modulate narrowband noise carriers after the carriers were filtered with the same bandpass analysis filters. The vocoded stimuli were synthesized by summing the channels into the acoustic waveform and were normalized to have the same root-mean-square energy as the unprocessed stimuli.
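The processing chain can be summarized in a short sketch (Python/NumPy here; the original processing was implemented in MATLAB). Two details are assumptions on our part: the half-wave rectification before the envelope low-pass filter, which is standard in channel vocoders but not stated above, and the use of butter(2, ...) for the bandpass stages, because a bandpass design doubles the filter order and thus yields the fourth-order filters described.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def noise_vocode(x, fs, n_channels, f_lo=300.0, f_hi=8500.0, env_fc=400.0, seed=0):
    """Noise-excited channel vocoder (sketch of the Experiment 1 processing)."""
    rng = np.random.default_rng(seed)
    # Logarithmically spaced corner frequencies for contiguous channels.
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    # Second-order low-pass filter for envelope extraction (400-Hz cutoff).
    env_sos = butter(2, env_fc, btype="low", fs=fs, output="sos")
    y = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Fourth-order Butterworth bandpass (order doubles in bandpass designs).
        band_sos = butter(2, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(band_sos, x)
        # Envelope: half-wave rectify (assumed), then low-pass.
        env = np.maximum(sosfilt(env_sos, np.maximum(band, 0.0)), 0.0)
        # Narrowband noise carrier: white noise filtered by the same band.
        carrier = sosfilt(band_sos, rng.standard_normal(len(x)))
        y += env * carrier
    # Normalize to the root-mean-square energy of the unprocessed input.
    return y * np.sqrt(np.mean(x**2) / np.mean(y**2))
```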

Table 2 Corner frequencies for the bandpass filters, arithmetic center frequencies (CFs), and bandwidths (BWs) for the channels used in the vocoder for Experiment 1

Note that changing the number of channels from 4 to 8 to 16 channels changes more than just spectral resolution. With 4 channels, there is likely to be much greater mismatch between the input and output frequencies than with 16 channels (e.g., formant information is smeared relatively more with fewer channels). In addition, the interaurally uncorrelated narrowband noise carriers add random fluctuations to the temporal envelopes of each channel, and the rate of those fluctuations depends on the narrowband noise bandwidth (e.g., Goupell & Litovsky, 2014).

Procedure

Listeners were seated at a computer, and the experiment was performed with a graphical user interface that was programmed in MATLAB (The MathWorks; Natick, MA). The listener initiated each trial by pressing a button on the computer interface. Stimuli were presented dichotically over headphones, meaning different sentences were presented to each ear simultaneously and spoken by a different talker. The instructions to the listeners were to report the words in the left ear and ignore the words in the right ear. They responded by selecting one of eight possible keyword choices for each category on a grid. Another button on the screen was then pressed to confirm the selections and end the trial. Listeners were forced to guess if they did not know one of the words in the sentence. The words presented in each ear were chosen at random and, in some cases, the same word was presented to both ears.

The number of channels was independently varied in each ear. Listeners were presented with 4 target ear × 4 interferer ear = 16 combinations of resolution levels (4, 8, 16 channels, and unprocessed control). A monaural control measurement was not performed because performance would likely have been close to 100% correct for most listeners and most conditions, at least for 8 channels or more (Waked et al., 2017); the strong ceiling effect would have limited its usefulness for interpreting the amount of interference caused by the contralaterally presented interferer. The experiment was performed using a method of constant stimuli in which all 16 combinations of stimuli were presented 10 times in randomized order per block. Listeners performed three blocks of 160 trials for a total of 480 trials. Therefore, there were 150 total keywords per condition (5 keywords × 10 trials × 3 blocks). Testing for this experiment took approximately 1.5 hours.
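The design amounts to a fully crossed, randomized trial list; a minimal sketch, with arbitrary labels for the four resolution levels:

```python
import itertools
import random

levels = [4, 8, 16, "unprocessed"]                    # per-ear conditions
conditions = list(itertools.product(levels, levels))  # 4 x 4 = 16 combinations

def make_block(rng, n_repeats=10):
    """One block of the method of constant stimuli: every target/interferer
    combination presented n_repeats times in randomized order (160 trials)."""
    trials = conditions * n_repeats
    rng.shuffle(trials)
    return trials

rng = random.Random(1)
blocks = [make_block(rng) for _ in range(3)]  # 3 blocks = 480 trials total
```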

Data analysis

The percentage of correct responses (PC) and percentage of across-ear confusions or intrusions (PI; i.e., reporting the word presented to the nontarget ear) were calculated (Brungart et al., 2001; Bryden et al., 1983). When the target and interferer were the same word, the response was counted as correct and not as an intrusion. PC and PI scores were transformed to rationalized arcsine units (RAUs; Studebaker, 1985) to better satisfy the assumption of homogeneity of variance required for an analysis of variance (ANOVA). The data were analyzed using a two-way repeated-measures ANOVA with factors target-ear channels (four levels: 4, 8, 16 channels, and unprocessed control) and interferer-ear channels (four levels: 4, 8, 16 channels, and unprocessed control). In cases where the assumption of sphericity was violated, a Greenhouse-Geisser correction was used. Bonferroni-corrected two-tailed paired t tests were used for post hoc comparisons.
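The RAU transform follows a standard closed form (Studebaker, 1985); a minimal implementation:

```python
import numpy as np

def rau(n_correct, n_total):
    """Rationalized arcsine transform (Studebaker, 1985): stabilizes the
    variance of percentage scores near floor and ceiling for ANOVA."""
    x = np.asarray(n_correct, dtype=float)
    n = np.asarray(n_total, dtype=float)
    theta = np.arcsin(np.sqrt(x / (n + 1.0))) \
          + np.arcsin(np.sqrt((x + 1.0) / (n + 1.0)))
    return (146.0 / np.pi) * theta - 23.0

# Example: 120 of 150 keywords correct (80%) maps to about 79.7 RAU.
print(rau(120, 150))
```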

Results

Figure 1a shows the PC data. Performance was near ceiling for 8 and 16 channels, as well as unprocessed speech, in the target ear. As the number of channels in the target ear increased, performance increased, F(1.5, 13.8) = 29.0, p < .0001, \( {\eta}_p^2 \)= 0.76. Post hoc tests showed that all conditions were different than the others (p < .05 for all six comparisons). As the number of channels in the interferer ear increased, performance decreased, F(3, 27) = 97.6, p < .0001, \( {\eta}_p^2 \)= 0.92. Post hoc tests showed that all conditions were different than the others (p < .001 for all six comparisons). There was no significant interaction between target-ear channels and interferer-ear channels, F(4.2, 37.9) = 1.9, p = .13, \( {\eta}_p^2 \) = 0.18. Finally, the matched conditions (i.e., same number of channels in both ears; filled symbols of Fig. 1a) were compared in post hoc tests. The matched 4-channel condition was lower than the other three conditions (p < .05 for all three comparisons), but no other comparisons were significantly different (p > .05).

Fig. 1

Average percentage of correct responses (PC; a) and intrusions (PI; b) as a function of the channels in the target ear for the selective attention task of Experiment 1. Filled symbols highlight the conditions where the channels are equal in the target and interferer ears. Error bars represent ±1 standard error. Note that the scale on the y-axis changes from a to b, but the relative size of the scale does not change

Figure 1b shows the PI data. As the number of channels in the target ear increased, intrusions decreased, F(3, 27) = 75.4, p < .0001, \( {\eta}_p^2 \)= 0.89. Post hoc tests showed that all comparisons were significantly different (p < .05 for all). As the number of channels in the interferer ear increased, intrusions increased, F(3, 27) = 31.9, p < .0001, \( {\eta}_p^2 \) = 0.78. Post hoc tests showed that all comparisons were significantly different (p < .05 for all). There was no significant interaction between target-ear channels and interferer-ear channels, F(9, 81) = 1.47, p = .17, \( {\eta}_p^2 \) = 0.14. Finally, the matched conditions (filled symbols of Fig. 1b) were compared in post hoc tests. The matched 4-channel condition was higher than the 8-channel condition (p = .016) and the 16-channel condition (p = .042). There were no differences for the other comparisons (p > .05 for all four comparisons).

Discussion

Selective attention to a single ear for dichotically presented sentences was affected by the spectral resolution in both the target and interferer ears. Figure 1a shows that increased spectral resolution produced increased speech understanding for the target ear, consistent with other studies investigating spectral resolution with vocoders (e.g., Friesen et al., 2001). The key finding of this experiment was that the spectral resolution of the interfering ear also affected speech understanding in the target ear: increasing spectral resolution in the interferer ear produced decreased speech understanding for the target ear. The highest performance occurred not for the unprocessed conditions in each ear but for the 4-channel interferer conditions. This suggests that the signals in the interfering ear are being processed, and it takes some resources to ignore them.

The results are in line with those found for NH listeners attending to an unprocessed target with a same-sex interferer in the same ear while trying to ignore a noise-vocoded interferer in the other ear whose number of channels was parametrically varied (Brungart et al., 2005). Those data showed that the intelligibility of the interfering ear is critical: the more intelligible the interferer speech, the more difficult it was to selectively attend to the target ear. Similar effects of interferer intelligibility occurred for vocoded stimuli with matched spectral resolution but nonoverlapping frequency bands across the ears, when the interferer was less intense but more intelligible (Gallun et al., 2007a) or was presented in masking noise (Kidd et al., 2005). Together, these data are evidence that listeners had difficulty attending to the ear with poorer resolution and speech understanding compared with the ear with better resolution and speech understanding. It may be the speech understanding performance, rather than the resolution of the stimuli, that drives the size of the effect (Dai et al., 2017).

Figure 1b provides further insight into why performance decreased as the target ear became more degraded than the interfering ear. PI decreased as the target-ear resolution increased. The largest PI occurred when the interfering ear was unprocessed. This further supports the interpretation of the PC data that listeners had difficulty ignoring the ear with the easier-to-understand signal because they more often reported the words from the interferer ear. The average intrusion rate was 43.7% (normalized over the number of incorrect responses), which is much lower than the intrusion rate of about 90% seen in several other studies (Arbogast et al., 2002; Brungart & Simpson, 2002; Brungart et al., 2001; Kidd et al., 2005) and closer to the intrusion rate of less than 65% seen in Gallun et al. (2007a).

Experiment 2: Divided attention in NH and CI listeners

Experiment 1 showed many PC scores near ceiling. To mitigate possible ceiling effects, we performed a divided attention task, which is much more demanding for listeners because they must attend to the words presented to both ears. Both NH and CI listeners were tested because there are as yet no data on this task in the CI population.

Methods

The same 10 NH listeners were tested as in Experiment 1, and the same stimuli were used. The procedure remained the same, except that a divided attention task was used rather than a selective attention task. In the divided attention task, listeners were to attend to both ears. After stimulus presentation, they were asked to report the words presented to just the left or right ear, the ear randomly chosen in each trial with 50% a priori probability (Gallun et al., 2007b).

In addition, seven postlingually deafened bilateral CI listeners performed this task. They were ages 44–73 years (mean age = 59.0 years). They all used Cochlear Ltd. devices (Cochlear Ltd.; Sydney, Australia). CI listener information is provided in Table 3. Stimuli were presented through their clinically fit everyday sound processors, set to their most commonly used program. Stimuli were delivered over Freedom TV/HiFi cables (Cochlear Ltd.; Sydney, Australia) to the direct audio input, similar to the methods used in Goupell et al. (2016) and Misurelli et al. (2020). The stimuli were nominally presented at 65 dB-A. A loudness adjustment procedure was performed before testing to ensure that stimuli were presented at a comfortable loudness in each ear separately and that loudness was balanced across the ears.

Table 3 CI listener demographics in Experiment 2

Results

For the NH listeners, the average PC scores decreased from approximately 91.3% in Experiment 1 to 58.3% in Experiment 2, but the pattern of results across experiments was similar. Performance was symmetrical across the ears, so the data were averaged over the different right (average PC = 58.5%) and left (average PC = 58.0%) target ears. Similar to Experiment 1, separate two-way repeated-measures ANOVAs were performed on RAU-transformed PC and PI values, with factors target ear and interferer ear. Post hoc tests and corrections to statistical calculations were approached in the same way.

Figure 2a shows that as the number of channels in the target ear increased, performance increased, F(1.3, 11.7) = 51.6, p < .0001, \( {\eta}_p^2 \) = 0.85. Post hoc tests showed that all conditions were different than the others (p < .05 for all six comparisons). As the number of channels in the interferer ear increased, performance decreased, F(1.18, 10.6) = 17.1, p < .0001, \( {\eta}_p^2 \)= 0.66. Post hoc tests showed that all conditions were different than the others (p < .0005 for five comparisons), except that the 8-channel interferer condition was not different than the 16-channel interferer condition (p > .05). There was no significant interaction between target-ear resolution and interferer-ear resolution, F(9, 81) = 1.37, p = .22, \( {\eta}_p^2 \)= 0.13.

Fig. 2

Average PC (a) and PI (b) as a function of the channels in the target ear for the divided attention task of Experiment 2. The solid line shows the average CI performance and the shaded box shows ±1 standard error. Note that the scales on the y-axis change from a to b and differ from those used in Fig. 1. Otherwise, conventions are the same as in Fig. 1

Figure 2b shows that as the number of channels in the target ear increased, intrusions decreased, F(3, 27) = 4.45, p = .012, \( {\eta}_p^2 \) = 0.33. Post hoc tests showed that the unprocessed condition was lower than 16 channels (p = .031); no other differences were significant (p > .05 for all). As the number of channels in the interferer ear increased, intrusions did not change, F(3, 27) = 2.62, p = .071, \( {\eta}_p^2 \) = 0.23. There was a significant interaction between target-ear resolution and interferer-ear resolution, F(9, 81) = 3.11, p = .003, \( {\eta}_p^2 \) = 0.26. The interaction primarily occurred because the unprocessed interferer conditions showed the same or relatively fewer intrusions when there were 4, 8, or 16 channels in the target ear, but relatively more intrusions when the target ear was unprocessed. Post hoc testing (Bonferroni corrected for 120 comparisons) revealed no statistically significant pairs (p > .05 for all).

Data for the NH listeners can be compared with the CI listeners in this study. Generally, the CI data (horizontal lines and shaded area) best correspond to the symmetric vocoding conditions (8 or 16 channels; filled symbols in Fig. 2).

Discussion

Experiment 2 showed a substantial drop in overall performance, as would be expected from the sharing of resources to process both speech streams (Gallun et al., 2007b), and yet still confirmed the effects in Experiment 1. Both Experiments 1 and 2 (Figs. 1a and 2a, respectively) showed that performance increased with an increasing number of channels in the target ear and decreased with an increasing number of channels in the interferer ear. The intrusions for this experiment revealed a different pattern than in Experiment 1: small differences occurred between conditions, and the fewest overall intrusions occurred for the unprocessed target (see Fig. 2b). The reason for the different pattern was likely that listeners attempted to process and remember the stimuli in both ears simultaneously. Also, a new pattern emerged in this experiment in that the cases with matched numbers of channels generally scored near the highest intrusions, and there was a significant interaction (see Fig. 2b). One interpretation is that it was more common to report the word in the incorrect ear if the stimuli sounded similar. Finally, there was no evidence of a right-ear bias in selective attention or an advantage for targets in that ear. This is contrary to Gallun et al. (2007b), who found a greater number of intrusions from the right ear during divided listening. The difference in findings may have resulted from differences in the stimuli: Gallun et al. (2007b) used five-channel tone-vocoded stimuli in background noise, the carrier frequencies were randomly chosen and differed across the ears, and the speech materials were the coordinate response measure sentences (Bolia et al., 2000). Alternatively, it could have been a result of practice attending to the left ear from Experiment 1, since the same listeners participated in both experiments.

Data from adult bilateral CI listeners were also collected for comparison. They showed no relative difficulty in performing the divided attention task compared with the NH listeners. This is consistent with the similar performance between these groups on the selective attention task performed in Goupell et al. (2016). The CI listeners had average performance similar to the NH listeners at the matched channel conditions (8 or 16 channels; see filled symbols in Fig. 2a). The demographic information in Table 3 shows that for six listeners, durations of CI use differed by less than 6 years across the ears, and the duration of deafness was the same except for S3. Therefore, these CI listeners could be argued to be a relatively symmetric group (e.g., compare against the CI listeners in Goupell, Stakhovskaya, et al., 2018). No formal statistical comparison was performed because of the age confound (young NH listeners compared with middle-aged and older CI listeners). Dichotic listening becomes more difficult with age (Humes et al., 2006), and aging effects in binaural tasks have recently been reported (Bernstein et al., 2020). Note that asking our listeners to remember two sentences, each with five keywords, put a massive demand on their short-term memory (see Baddeley et al., 2015, for a review). Demands on short-term memory likely exacerbated the age confound between our NH and CI listeners. Future work could consider using fewer keywords and providing age-matched controls for this task (Cleary et al., 2018).

Finally, it should be noted that symmetry in the CI listeners cannot fully be assumed. Monaural control measurements should be added using a relatively difficult open-set speech corpus. Doing so would allow us to evaluate the effect of asymmetry in the CI listeners (Bernstein et al., 2020; Goupell, Stakhovskaya, et al., 2018).

Experiment 3: Asymmetrical spectral shift

Speech degradation through a vocoder can occur with a number of different signal manipulations (Shannon et al., 1998). One of the most detrimental signal manipulations is to introduce frequency-to-place mismatch (Dorman et al., 1997; Rosen et al., 1999), which simulates shallow CI insertion depths (Landsberger et al., 2015). Therefore, we hypothesized that this type of degradation would produce similar asymmetries in the ability to selectively attend to an ear.

Listeners and equipment

Five NH listeners between the ages of 19 and 38 years were tested (mean age = 24.4 years). Similar to Experiment 1, they had hearing thresholds ≤20 dB HL between 250 and 8000 Hz and ≤10 dB interaural asymmetry in thresholds. Three listeners had previous exposure to vocoded speech.

Stimuli

As in previous experiments, the stimuli were either unprocessed or vocoded. However, several aspects of the vocoder differed from the previous two experiments to optimize testing with simulated frequency-to-place mismatch, which we will simply call “shift.” Instead of changing the number of channels used across conditions, this variable was held constant at 8 channels. The corner frequencies on the bandpass analysis filter bank were still logarithmically spaced, but were changed to cover a frequency range from 200 to 5000 Hz, which was lower in frequency than in Experiments 1 and 2 and more similar to the range used in Rosen et al. (1999). The rationale behind this change was to avoid shifting carriers to frequencies where hearing status was unknown (e.g., well above 8 kHz).

The envelope was extracted using a second-order low-pass filter with a 50-Hz cutoff frequency, lower than that used in Experiments 1 and 2. The rationale behind the lower cutoff frequency was to prevent the sidebands of the modulated carriers (i.e., the additional spectral components introduced by the modulation) from falling into auditory filters separate from the carrier (i.e., from becoming resolved), which is known to greatly improve performance (Souza & Rosen, 2009). For the stimuli used in this experiment, depending on the envelope cutoff frequency, the unshifted conditions could have had resolved sidebands and the shifted conditions could have had mostly unresolved sidebands, introducing a confound that we wanted to avoid.

The envelopes were used to modulate tonal carriers instead of noise bands. The rationale behind this change was to avoid the interaural decorrelation of the carriers (Goupell et al., 2013; Goupell, Stoelb, et al., 2018), which would be perceived as diffuse in the head when the carriers overlapped in frequency across the ears (Whitmer et al., 2012). The frequencies of the tonal carriers were either at the center frequency of the channel or spectrally shifted by 3 or 6 mm to simulate a shallow insertion depth (see Table 4). The shift was implemented by starting with the carrier frequencies in Hz, converting them to a distance in cochlear location in mm using the frequency-to-place conversion equation (Greenwood, 1990), shifting them by a certain number of mm, and then converting them back to Hz.
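The conversion uses Greenwood's (1990) human map, F = A(10^{ax} − k) with A = 165.4, a = 0.06 per mm, and k = 0.88, where x is the distance from the apex in mm; a minimal sketch:

```python
import numpy as np

# Greenwood (1990) human cochlear map: F = A * (10**(a * x) - k),
# with x the distance from the apex in mm.
A, a, k = 165.4, 0.06, 0.88

def hz_to_mm(f):
    """Cochlear place (mm from apex) corresponding to frequency f (Hz)."""
    return np.log10(f / A + k) / a

def mm_to_hz(x):
    """Frequency (Hz) at cochlear place x (mm from apex)."""
    return A * (10.0 ** (a * x) - k)

def shift_carrier(f, shift_mm):
    """Shift a carrier frequency basally (upward in frequency) by shift_mm,
    simulating a shallower insertion depth."""
    return mm_to_hz(hz_to_mm(f) + shift_mm)

# Example from Table 4: Channel 7 (CF = 2734.4 Hz) shifted by 3 mm
# lands near 4213 Hz.
print(round(shift_carrier(2734.4, 3.0), 1))
```

In synthesis, each channel envelope then modulates a pure tone at the shifted carrier frequency rather than at the analysis-band center.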

Table 4 Conversion of frequency (in Hz) to place (in mm) for the shifted conditions of Experiment 3

A loudness correction was performed to diminish any differences in performance across the conditions based on loudness or audibility, particularly for channels shifted to much higher frequencies (Faulkner et al., 2003; Waked et al., 2017). The frequency-specific compensation adjusted the level of each channel by 50% of the difference in dB between the thresholds at the unshifted and shifted carrier frequencies, with thresholds taken from the minimum audible field (MAF) curve. For example, Channel 7 with a CF = 2734.4 Hz is shifted by 3 mm to a CF = 4213.4 Hz, and this band is amplified by 0.6 dB (see Table 4).
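In code, the correction reduces to half the threshold difference between the two carrier frequencies. A minimal sketch, with the MAF curve left as a placeholder callable because its exact values are not reproduced here:

```python
def loudness_correction_db(cf_unshifted_hz, cf_shifted_hz, maf_db):
    """Channel gain in dB: 50% of the MAF threshold difference between the
    shifted and unshifted carrier frequencies. `maf_db` is a callable
    returning the minimum-audible-field threshold (dB SPL) at a given
    frequency, e.g., an interpolation of standard MAF data (placeholder)."""
    return 0.5 * (maf_db(cf_shifted_hz) - maf_db(cf_unshifted_hz))

# Per Table 4, Channel 7: 0.5 * (maf(4213.4) - maf(2734.4)) = +0.6 dB.
```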

As before, the vocoded stimuli were synthesized by summing the channels into the acoustic waveform and were normalized to have the same root-mean-square energy as the unprocessed stimuli. Stimuli were calibrated to have an A-weighted sound pressure level of 65 dB for the unshifted conditions. The shifted conditions had the same root-mean-square energy before shifting and loudness correction.

Note that the mismatch between the input and output frequencies is dictated by the specific vocoder parameters. Because this experiment used tonal carriers, the envelope of each vocoder channel is relatively well represented and lacks the random envelope fluctuations of the narrowband noise carriers from Experiments 1 and 2.

Procedure

Listeners completed two types of blocks of trials, alternating between testing and training blocks. Training is necessary to evaluate the effect of shift on vocoded speech understanding because listeners improve substantially over time (Rosen et al., 1999). As a measure of acute performance without training, the first block was a testing block with no feedback. For testing blocks, listeners were presented with combinations of conditions with unprocessed speech and sine-vocoded speech with 0, 3, or 6 mm of shift. For the testing blocks, stimuli were again presented dichotically over headphones, and listeners were asked to report the sentences in the target ear. A selective attention task was used as in Experiment 1. Listeners were told to attend only to the left ear (target ear) and ignore the right ear (interfering ear), and as in the previous experiments, the words in both ears could be the same because of the completely random selection.

For training blocks, listeners were presented with only 6 mm of shift, only in the left ear (thus, only the target was presented, no interferer), and were provided with feedback on their performance after each trial. The correct words on the computer interface were highlighted, then the unprocessed sentence was played, and finally the vocoded sentence was repeated (Davis et al., 2005).

The experiment was performed using a method of constant stimuli in which all 16 combinations of stimuli were presented 10 times in randomized order per block. Testing occurred over 2 days. On each day, listeners completed three testing blocks of 160 trials each, interspersed with training blocks of 60 trials each. Therefore, each session consisted of testing and training blocks in the following order. Day 1 consisted of testing (Fig. 3: block number 1, all conditions, no feedback), training (6-mm shift only, with feedback), testing (block number 2), training, and testing (block number 3). Day 2 consisted of testing (Fig. 3: block number 4), training, testing (block number 5), training, and testing (block number 6). Each session lasted 2 hours. Therefore, the experiment took 4 hours, which is comparable to the amount of training needed to see saturation in performance for 6 mm of shift (Rosen et al., 1999; Waked et al., 2017).

Fig. 3

Average PC (row a) and PI (row b) as a function of testing block number for different amounts of shift in the target ear (different panels) and interferer ear (different symbols) for the selective attention task of Experiment 3. Average PC for the monotic 6-mm shift training blocks is shown in the rightmost panel of row a. Note that the scale on the y-axis changes from rows a to b, both in absolute numbers and relative size. Otherwise, conventions are the same as in Fig. 1

Data analysis

The data were analyzed using a three-way repeated-measures ANOVA with factors block number, target-ear shift (four levels: unprocessed, 0, 3, and 6 mm), and interferer-ear shift (four levels: unprocessed, 0, 3, and 6 mm). As in Experiment 1, percentages were transformed to RAUs, a Greenhouse-Geisser correction was applied when sphericity was violated, and Bonferroni-corrected two-tailed paired t tests were used for post hoc comparisons. Because improvement due to training was assumed to be monotonic, Helmert contrasts were used to compare performance in a given block with the mean performance of the subsequent blocks. This evaluated the number of blocks needed to show asymptotic performance.
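As a sketch of what these contrasts compute, assuming the forward ordering described above (each testing block compared with the mean of the blocks that follow it):

```python
import numpy as np

def forward_helmert(n_levels):
    """Contrast matrix: row i compares level i with the mean of all
    subsequent levels (here, one testing block vs. the remaining blocks)."""
    C = np.zeros((n_levels - 1, n_levels))
    for i in range(n_levels - 1):
        C[i, i] = 1.0
        C[i, i + 1:] = -1.0 / (n_levels - 1 - i)
    return C

# Six testing blocks -> five contrasts; row 1 is block 1 vs. mean(blocks 2-6).
print(forward_helmert(6))
```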

Results

PC data

Figure 3a shows the PC values for the testing blocks as a function of block number. The rightmost panel also includes the PC values for the monaural training blocks. The repeated-measures ANOVA results are shown in Table 5. Performance increased with increasing block number (p = .045). Helmert contrasts showed that blocks 2, 3, and 5 were different than the subsequent blocks (p < .05 for all three), and blocks 1 and 4 were not different than the subsequent blocks (p > .05 for both). For simpler visualization of the data, the PC values from the last block (i.e., block 6) in Fig. 3a are shown in Fig. 4a.

Table 5 Repeated-measures ANOVA results for Experiment 3 for the two dependent variables (DVs) PC and PI
Fig. 4

Average PC (a) and PI (b) for the last testing block as a function of target-ear shift for the selective attention task of Experiment 3. Filled symbols highlight the conditions where the shift is equal in the target and interferer ears. Error bars represent ±1 standard error. Note that the scale on the y-axis changes from a to b, but the relative size of the scale does not change

Performance was near ceiling when the target was unprocessed. In the target ear, performance significantly decreased with increasing shift (p = .004). Post hoc tests showed that all target-shift conditions were significantly different than the others (p < .0001 for all). In the interferer ear, performance significantly increased with increasing shift (p = .010). Post hoc tests showed that performance for the unprocessed interferer condition was not different than the 0-mm interferer-shift condition (p > .05), but was significantly lower than the 3-mm and 6-mm interferer-shift conditions (p < .05 for both). Performance for the 0-mm interferer-shift condition was not different than the 3-mm interferer-shift condition (p > .05), but was significantly lower than the 6-mm interferer-shift condition (p < .0001). Performance for the 3-mm interferer-shift condition was significantly lower than the 6-mm interferer-shift condition (p < .0001).

There was a significant target-shift × interferer-shift interaction (p = .001). This interaction occurred partially because of the relatively smaller differences among interferer conditions for the unprocessed and 6-mm target-shift conditions compared with the 0-mm and 3-mm target-shift conditions. The interaction also occurred partially because conditions that were matched in shift (e.g., 3-mm target shift and 3-mm interferer shift) showed performance that was equal to or less than all the other conditions at each target shift. This is highlighted by the filled symbols in Figs. 3a and 4a: the matched target- and interferer-ear shift conditions form the lowest curve in each panel of Fig. 3a and the lowest point in Fig. 4a, except for the 6-mm target condition, where there were no differences between interferer conditions. All pairwise comparisons for PC are shown in Table 6. The target-shift × block number, interferer-shift × block number, and target-shift × interferer-shift × block number interactions were not significant (p > .05 for all).

Table 6 Bonferroni-corrected paired comparisons (two-tailed paired t tests) for different target-interferer shift combinations for PC and PI in Experiment 3

PI data

Figure 3b shows the PI data for the testing blocks. The repeated-measures ANOVA results are shown in Table 5. Intrusions did not significantly change with block number (p = .21). For simpler visualization of the data, the PI values from the last block (i.e., block 6) in Fig. 3b are shown in Fig. 4b. In the target ear, intrusions significantly increased with increasing shift (p < .0001). Post hoc tests showed that all target-shift conditions were significantly different than the others (p < .0001 for all comparisons). In the interferer ear, intrusions significantly decreased with increasing shift (p = .041). Post hoc tests showed that intrusions for the 6-mm interferer-shift condition were lower than for the unprocessed, 0-mm, and 3-mm interferer-shift conditions (p < .0001 for all), but were otherwise not different (p > .05 for all).

There was a significant target-shift × interferer-shift interaction (p = .005), which occurred because conditions that were matched in shift (e.g., 3-mm target shift and 3-mm interferer shift) showed intrusions that were equal to or greater than all the other conditions at a fixed level of target shift, except for the matched 6-mm target- and interferer-shift condition. This is highlighted by the filled symbols in Figs. 3b and 4b: the matched target- and interferer-ear shift conditions form the highest curve in each panel, mirroring the pattern observed in Figs. 3a and 4a for PC. All pairwise comparisons for PI are shown in Table 6. The target-shift × block number, interferer-shift × block number, and target-shift × interferer-shift × block number interactions were not significant (p > .05 for all).

Discussion

The dichotic listening selective attention task used in Experiment 3 simulated a relatively good and a relatively poor ear with a vocoder, as in the previous experiments, but this time an 8-channel vocoder with different amounts of frequency-to-place mismatch was used. An important aspect of this type of vocoder is that while initial performance can be very poor (Dorman et al., 1997; Shannon et al., 1998), practice and training can recover much of the decrement for relatively large (e.g., 6 mm) shifts (Rosen et al., 1999; Waked et al., 2017). Therefore, training was also implemented in this experiment to determine how performance changed over the course of 4 hours while alternating between training blocks with feedback and testing blocks without.

The PC data are shown in Figs. 3a and 4a. Mostly similar to Experiments 1 and 2, the results showed that dichotic listening performance was worse as the target ear had more degradation and more shift, but almost always improved as the interferer ear had more degradation and more shift.

One interesting feature in the data is that the matched conditions (shown by filled symbols in Figs. 3a and 4a) produced the same or worse performance than the mismatched conditions at each target-ear shift. In other words, the effect of interferer-ear shift was nonmonotonic, depending on the target-ear shift. Such a pattern did not appear in the PC data of Experiments 1 and 2 (see Figs. 1a and 2a), but bears some similarity to the PI data in Experiment 2 (see Fig. 2b). Given that shift distorts the mapping of acoustic cues to phonemes, listeners likely must learn novel mappings or adapt current ones for the 3-mm and 6-mm shift conditions. There is also an additional pitch cue introduced for conditions that have different shifts; more shift increases the carrier frequencies and would produce higher pitches. The pattern of data suggests that confusion occurs most when the sounds across the ears are most similar. Such similarity effects have also been shown when attempting to perform dichotic listening with unprocessed talkers and interferers (e.g., Treisman, 1964). Specifically, less interference is seen when using steady-state noise interferers (Brungart & Simpson, 2002), and speech-like modulations are important for producing interference (Brungart et al., 2005). This interpretation of the data is also supported by the intrusions shown in Fig. 3b. Generally, the PI data of Experiment 3 mirror the PC data, meaning that for the matched conditions, PC was the worst because PI was the highest. In other words, the reason PC was relatively low was that listeners were more inclined to respond with the words in the interferer ear.

There was an improvement in performance with block number, and improvement was largest for the conditions with 3 and 6 mm of shift. Such a result was expected given previous literature showing the effects of training and adaptation with shifted vocoders (Rosen et al., 1999). For this specific corpus of matrix sentences, 4 hours of training and interleaved testing previously showed that listeners can improve to approximately 60% correct with a 6-mm shift (Waked et al., 2017). During the four training blocks of the current study, which used monotic presentation, PC for the 6-mm shift condition improved from 61.0% for training block 1 to 84.5% for training block 4. The difference between the studies may have been the result of different listeners or different testing conditions. The current study had longer training blocks, 60 trials per block compared with 30 per block in Waked et al. (2017).

PC for the monotic training blocks was approximately 10%–20% higher than PC for the dichotic testing blocks. Improvement occurred for all three vocoded conditions. For unshifted vocoded speech, there is usually rapid improvement on the order of a few sentences (Davis et al., 2005). For this speech corpus specifically, performance saturated after one session of training with feedback for a 0-mm shift. Performance saturated by four feedback training blocks at the group level for a 6-mm shift, although variability showed some individuals improving even after 11 blocks of training and testing (Waked et al., 2017). In the current study, Helmert contrasts did not show a convincing plateau in performance; therefore, listeners appeared to need more training and testing time to demonstrate a saturation in performance. It could be that the listeners not only improved in understanding the vocoded speech but also improved at attending to the target ear in the dichotic listening task (Tallus et al., 2015). On one hand, the trend of decreasing PI across blocks supports the idea that listeners may have improved at focusing their attention, not just at identifying the correct words. On the other hand, the monotic training and dichotic testing conditions were parallel for the 6-mm shift conditions (see Fig. 3a, rightmost panel), suggesting that the increased PC in both conditions was primarily driven by improvements in shifted vocoded speech understanding.

Other studies have performed dichotic or binaural hearing studies using different shifts between the ears. Siciliano et al. (2010) tested 6-channel vocoded and shifted maps using channels that were interleaved across the ears. They found that listeners were unable to integrate the information across the ears: performance never exceeded that of the unshifted condition with three channels in one ear, despite the fact that the shifted ear carried three channels of potentially useful but admittedly distorted speech information. It could be that the listeners were unable to support interaural integration with only three channels in each ear, and interaural integration may have been possible if more information (i.e., more channels) were available. Such a result would be more consistent with the findings of the current study, which showed some interference and interaural integration. Goupell, Stoelb, et al. (2018) measured spatial release from masking (i.e., the improvement in performance when comparing colocated and spatially separated talkers) using vocoders and shift. They found that spatial release from masking was reduced as interaural mismatch was increased to 6 mm. Interestingly, the change in spatial release from masking was asymmetrical across ears based on the spatial location of the target. Spatial release from masking decreased more when the ear with the better signal-to-noise ratio (SNR) was shifted to higher frequencies than when the worse-SNR ear was shifted. In other words, it appeared that listeners used the speech information in the ear with the worse SNR because it was more intelligible; given the trade-off between SNR and intelligibility, a benefit from attending to a 6-mm shifted ear with a relatively better SNR did not emerge. Such a result is broadly consistent with the findings of this experiment: it appears easier to ignore a poor signal and attend to a good one, even when the instructions are to direct attention to the poorer ear.

General discussion

Overview

In typical human sensory systems like hearing and vision, there is bilaterally symmetrical processing of inputs. With an increase in the number of people receiving hearing rehabilitation with a CI, there are more people experiencing large functional asymmetries in their hearing, which may have consequences for their ability to perform spatial-hearing tasks. This paper investigated the effects of functional hearing asymmetry on dichotic listening by parametrically and independently varying the spectral resolution or shift in both the target (left) and interferer (right) ears. We hypothesized that it would be more difficult to attend to a relatively functionally poorer ear (i.e., less resolution or more shift) and ignore a relatively better ear, therefore simulating some of the asymmetrical ear differences that could occur in the CI population. All three experiments that varied the type of vocoder (noise-excited/number of channels [Experiments 1 and 2] vs. tone-excited/shift [Experiment 3]), type of dichotic listening (selective [Experiments 1 and 3] vs. divided attention [Experiment 2]), and training (Experiment 3) consistently supported this hypothesis (see Figs. 1a, 2a, 3a, and 4a). The interpretation was further supported by observing the intrusions, or times when the listener reported the words presented in the nontarget ear (see Figs. 1b, 2b, 3b, and 4b).

Results viewed through the shared resource and integrated strategy models

Critical features in the data of these experiments were the nonmonotonic PC and PI functions; the worst performance and the most intrusions occurred for the conditions where the vocoder was matched across the ears in Experiment 3 (see Figs. 3 and 4). The PI values of Experiment 2 (Fig. 2b) showed a similar but less pronounced pattern. There was little evidence of such a pattern in Experiment 1 (Fig. 1).

Two models have been proposed to explain within-ear and across-ear interference effects over a number of studies, and both are relevant to our data set. Brungart and Simpson (2002), in an effort to explain why listeners experienced more across-ear interference when the target ear was at a relatively lower (negative) target-to-masker ratio, proposed a model based on a "shared resource" limitation. Under this model, the greater the difficulty of the listening task in the target ear, the fewer the resources left to perform subsequent speech segregation (e.g., suppressing a contralateral interferer in a dichotic listening experiment). Such a model is consistent with Gallun et al. (2007a) and Gallun et al. (2007b), who suggested that an important factor in these types of tasks is the task-based demand on processing resources. Listeners appeared to combine information across ears more often when spectrotemporal similarity and task demands were high. Therefore, stimulus information and task complexity should play a role in dichotic listening, and interference across ears should occur when more resources are allocated to more demanding tasks.

Data from Kidd et al. (2003), however, were not consistent with the idea that the most interference is observed when the task is most difficult in the target ear. They found that 1-kHz tone detection in a set of random-frequency tone maskers (which was relatively easy to segregate) showed more across-ear interference than detection with a fixed-frequency tone masker (which was relatively difficult to segregate). Data from that experiment and those in Brungart and Simpson (2007) prompted the development of an alternative "integrated strategy" model of dichotic listening. Under this model, it is assumed that listeners focus attention using a single listening strategy (e.g., spatial, pitch, or loudness differences), and this strategy is applied to the combined information arriving at the two ears. The strictest form of this model forces the listener to adopt a single listening strategy, presumably the one that provides the best overall segregation, which is then applied to each stream. For example, listeners had varying levels of difficulty ignoring a contralateral speech interferer when there was also an ipsilateral interferer in the target ear, depending on the similarity of the talkers (i.e., the difficulty of the segregation) in the target ear (Brungart & Simpson, 2002, 2004, 2007).

In summary, the "shared resource" model of dichotic attention predicts that across-ear interference depends primarily on the difficulty of the within-ear listening task. The "integrated strategy" model predicts that performance depends on the within-ear and across-ear target-interferer similarity. The most across-ear interference would occur when the across-ear interferer is more similar to the target talker than the within-ear interferer (for cases with interferers in both ears, as in Brungart & Simpson, 2007, not for pure dichotic listening with no interferer in the target ear, as in our current study). This is because the optimal strategy to segregate target and interferer in the attended ear would be suboptimal for eliminating interference from the unattended-ear interferer. An important prediction of the integrated strategy model is that for a dichotic listening task with the target in one ear and the interferer in the other, the listener would presumably choose the optimal strategy of attending to the target ear, and therefore minimal or no across-ear interference would occur (Brungart & Simpson, 2007).

Our data appear to be broadly consistent with the data in Brungart and Simpson (2007), where the most interference occurred when there was the most similarity between the across-ear target and interferer; this finding partially supports the integrated strategy model. Our data are also consistent with Brungart et al. (2005), where performance decreased systematically as the number of channels in a vocoded interferer ear was increased (i.e., there was more interference as the interferer became more intelligible); this finding partially supports the shared resource model if it were extended to include the difficulty of processing both the target and interferer ears. Both models could be combined into a "relative salience" model. In such a model, resources would be allocated to perform multiple tasks, like understanding vocoded speech or allocating space to short-term memory. Resources for segregation would be limited by these other tasks, and signals with relatively more similarity would require more resources for segregation. If those resources are limited by other tasks, an increasing number of errors in dichotic listening would result.

Specifically, for the data collected in this study, a relative salience model might explain an optimal listening strategy in the following way. First, a fixed amount of resources would need to be allocated to short-term memory. Experiment 2 would have a particularly high memory demand because listeners had to remember two speech streams, each five items long. Listeners would know how much memory to preallocate from performing the numerous trials of each experiment. Second, the level of degradation of the vocoding would make the listener use different amounts of resources in understanding the speech, more as the signal became poorer in resolution or more shifted. This allocation would be unknown to the listener from trial to trial given the randomization of vocoding conditions within blocks. The strict form of the integrated strategy model suggests that all speech would then be combined and segregation would occur based on a single best rule. For all three experiments in the current study, attention to the target by ear selection should then occur and be constant across experiments. However, the striking nonmonotonicities of Experiment 3 (Figs. 3 and 4) suggest that stream segregation by ear is imperfect. In some cases and studies, there is the assumption that ear selection is nearly perfect (e.g., Brungart & Simpson, 2007, p. 1725), while others acknowledge that some of the stream in the nonselected ear remains and is only attenuated (e.g., Brungart & Simpson, 2007, p. 1733; Treisman, 1964).

It should be noted here that Experiment 3 likely had the largest task demands for understanding the 6-mm shifted vocoded speech, given the poor performance in this condition, thus leaving the fewest resources for stream segregation. The nonmonotonicities in Experiment 3 could be explained if a secondary segregation stage were added to the model; the secondary segregation would be based on pitch differences because the pitch of the vocoded speech depends on the amount of shift. This secondary segregation could occur before the ear selection, which would be consistent with the idea of having multiple selection modes, such as early modes based on acoustic stimulus properties (Wightman et al., 2010) and late modes based on semantic analysis (Broadbent, 1958; Johnston & Heinz, 1978). Alternatively, the secondary segregation could occur after the ear selection if that selection is imperfect and leaves some residual trace of an attenuated interferer stream. A reasonable assumption for having primary and secondary segregation stages is that the primary would be the more effective of the two, perhaps having more resources allocated to it.

A relative salience model could also explain the lack of nonmonotonicities in Experiment 1. In that experiment, the noise vocoder would have produced stimuli with no appreciable pitch differences (or other salient segregation cue, assuming intelligibility is not a strong cue) across resolutions, meaning the stimuli would have sounded very similar across conditions. This would have provided less of a secondary segregation cue and therefore produced data that lack nonmonotonicities for the matched-resolution conditions. The hint of nonmonotonicity for PI in Experiment 2 might be explained by the relatively large short-term memory demands on the limited pool of resources available before stream segregation, or by the fact that a divided-attention task might naturally increase the propensity to respond to the nontarget ear.

There are two alternative explanations that could account for the nonmonotonicities in Figs. 3 and 4. It could be that listeners were segregating via pitch and not ear; such an explanation would better align with the strict integrated strategy model, in which listeners can only use the single best rule. This explanation is compelling in its simplicity and would suggest that pitch cues are stronger segregation cues than spatial cues. Given the assumed ease of dichotic listening in most cases, however, it is not clear that segregation via pitch would indeed be stronger for these stimuli. It could also be that the words for the matched shifts had a greater propensity to fuse, yielding the observed data patterns. Such an explanation would only apply to the tone vocoder of Experiment 3, and not to the noise vocoder used in Experiments 1 and 2. As mentioned previously, because the corpus presented different words and different talkers in the two ears, the envelopes were quite different across the ears and likely limited any perceived fusion.

In summary, the current data set seems consistent with many aspects of the shared resource (Brungart & Simpson, 2002) and integrated strategy (Brungart & Simpson, 2007) models. A combined relative salience model was proposed that has a finite amount of resources, attends to the intelligibility of both the target- and interferer-ear stimuli (including the relative differences across the two ears), and perhaps includes more than one stream segregation stage. Such a conceptual framework would appear to explain the lack of nonmonotonicities in Experiment 1 and their clear existence in Experiment 3. Admittedly, this relative salience model is more complicated than either of the original dichotic listening models, but it should still capture the effects seen in the studies that motivated those models (Brungart & Simpson, 2002, 2004, 2007; Brungart et al., 2005; Brungart et al., 2001; Gallun et al., 2007a; Kidd et al., 2003; Kidd et al., 2005).

Comparison with CI performance

While the vocoding in this study only crudely simulates the degradations actually experienced by real CI listeners, it provided a simple signal-processing tool to simulate functionally asymmetrical hearing. The results from these simulations could partially explain performance differences seen in actual CI listeners. For example, most bilateral CI listeners demonstrate a relatively symmetrical ability to perform selective attention; however, two of 11 adult (Goupell et al., 2016) and two of 10 child (Misurelli et al., 2020) bilateral CI listeners showed highly asymmetric performance. The data from those studies can be compared with the simulated asymmetrical selective attention performance in Experiments 1 and 3 (Fig. 1a and Figs. 3 and 4, respectively). The larger asymmetry-related differences for the real CI listeners, compared with the smaller differences for the simulated CI listeners, suggest that differences in resolution and/or shift might only partially explain the CI data. Comparisons between NH and CI listeners were shown directly in Experiment 2 (Fig. 2) for divided attention, for which there was good correspondence between the data sets. Similarly, Goupell, Stakhovskaya, et al. (2018) intentionally recruited bilateral CI listeners who could have asymmetrical hearing abilities and showed, in a speech-on-speech masking task, that binaural unmasking and head shadow were reduced. In the broader literature, individual bilateral CI and hearing-aid users show variable changes in performance when a second ear is added; the little benefit, or even decrement, shown by some may be a result of asymmetrical hearing (Mosnier et al., 2009; Polonenko et al., 2018; Reeder et al., 2014; Walden & Walden, 2005).
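For readers unfamiliar with the technique, the sketch below shows a minimal channel vocoder of the general type used in such CI simulations. The channel count, frequency range, and filter settings here are illustrative assumptions, not the exact parameters of the current study.

```python
# A minimal channel-vocoder sketch (noise or tone carriers); parameter values
# are illustrative assumptions, not the settings used in the current study.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(signal, fs, n_channels=8, f_lo=200.0, f_hi=7000.0, carrier="noise"):
    """Split `signal` into log-spaced bands, extract each band's envelope,
    and use it to modulate a noise or tone carrier in the same band."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)     # log-spaced band edges
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, signal)))  # Hilbert envelope
        if carrier == "noise":
            carr = sosfiltfilt(sos, np.random.randn(len(signal)))
        else:  # tone carrier at the band's geometric center frequency
            fc = np.sqrt(lo * hi)
            carr = np.sin(2 * np.pi * fc * np.arange(len(signal)) / fs)
        out += env * carr
    return out / np.max(np.abs(out))                     # avoid clipping

# Fewer channels -> poorer spectral resolution; a basilar-membrane shift
# (e.g., the 6-mm shift) could be simulated by synthesizing the carriers in
# bands displaced upward relative to the analysis bands.
```

Presenting a higher-resolution vocoding to one ear and a lower-resolution or shifted vocoding to the other is what produces the functionally asymmetrical inputs examined here.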

Single-sided deafness is arguably the most asymmetrical listening condition that can be achieved, because there is normal acoustic hearing in one ear and a CI in the other. Two studies of real single-sided-deafness CI listeners are relevant here and report contrasting results (Bernstein et al., 2016; Bernstein et al., 2020). They found that binaural unmasking of speech could be produced if the target was in the NH ear, but that interference could occur if the target was in the CI ear. Such a result is consistent with the broad finding of the current study that attending to a target in a relatively poorer ear is difficult when a clearer interfering talker is present.

Limitations and future directions

While the current study is an important first step in understanding asymmetrical hearing, dichotic listening is only an idealization of the cocktail party phenomenon and not a configuration that people typically experience. Rather, listeners would normally receive the signals from both target and interferer in both ears. When performing spatial release from masking with more realistic sound sources, interaurally asymmetrical shifts decrease binaural benefits (Goupell, Stoelb, et al., 2018; Xu et al., 2020). In addition, listeners attend to a functionally better (i.e., less shifted) ear, even when they experience a worse target-to-masker ratio (Goupell, Stoelb, et al., 2018), similar to the main finding of the current study that it is relatively difficult to ignore clearer speech. Therefore, another avenue to explore is adapting this type of study to a more realistic listening situation, such as different talkers that are realistically spatially separated. In such cases, a wide range of listeners demonstrate spatial release from masking. The physics of the head introduces interaural time and level differences that convey spatially separated sound locations; in contrast, the dichotic listening configuration in this study was effectively an infinite interaural level difference. Changing the resolution across ears would affect spatial cues and the perceived separation of sources, but the results of the present study suggest that spatial release from masking might also be affected by attention mechanisms.
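For orientation only (no such computation was performed in the current study), the magnitude of head-induced interaural time differences can be approximated with the classic spherical-head formula, often attributed to Woodworth:

\[ \mathrm{ITD}(\theta) \approx \frac{a}{c}\,(\theta + \sin\theta), \]

where \(a \approx 0.0875\) m is the head radius, \(c \approx 343\) m/s is the speed of sound, and \(\theta\) is the source azimuth in radians. At \(\theta = 90^{\circ}\) this yields a maximum ITD of roughly 660 μs, a finite cue that stands in stark contrast to the effectively infinite interaural level difference of the dichotic configuration.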

Another important set of conditions omitted from this study was monotic control measurements. Figure 3 shows a decrease in performance of 10%–20% from monotic to dichotic listening, similar to the decrease found in Goupell et al. (2016) and Misurelli et al. (2020). Monotic control measurements would have explicitly quantified asymmetries across conditions, which would be helpful in interpreting effects based on the relative intelligibility of the target and interferer speech. Such control conditions would be critical as a baseline for actual CI listeners.

Future work should also consider further characterizing the NH listeners. No questionnaires inquired about histories of asymmetrical hearing or chronic ear infections. In addition, use of the left ear only as the target ear in the selective attention tasks (Experiments 1 and 3) may have affected performance because of the right-ear advantage (Cooper et al., 1967; Kimura, 1967). Performance may have been modulated more strongly because of this choice, since intrusions may be more likely when attending to the left ear (Hugdahl et al., 2001). Investigating ear effects is a possible future direction for this type of work.

One limitation of this study was the use of this particular corpus, which had sentences that were syntactically correct but sometimes semantically irregular. There was also a range of confusability within word categories (see Table 1). Confusions within word categories may have slightly decreased PC and PI. For example, if the target word was “Blue” and the interferer word was “Old,” and the subject responded with “Cold,” this would not count as an intrusion, but would be a legitimate confusion, given the small acoustic differences between “Old” and “Cold,” especially for spectrally degraded speech. While this is likely a small effect that may particularly alter the most degraded conditions, future studies could attempt to use a different speech corpus.
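To make the scoring distinction concrete, the following sketch shows one plausible way per-trial scoring could be implemented. The function and the example words are hypothetical illustrations, not the study's actual scoring code.

```python
# A hypothetical per-trial scoring sketch for PC (proportion correct) and PI
# (proportion of intrusions); the example words are illustrative only.
def score_trial(responses, targets, interferers):
    """Count responses that match the target, match the interferer
    (intrusions), or match neither, across word positions."""
    n_correct = n_intrusion = n_other = 0
    for resp, targ, intf in zip(responses, targets, interferers):
        if resp == targ:
            n_correct += 1
        elif resp == intf:
            n_intrusion += 1
        else:
            # e.g., responding "Cold" to target "Blue" / interferer "Old"
            # counts as neither correct nor intrusion, even though it is a
            # plausible within-category confusion with the interferer.
            n_other += 1
    return n_correct, n_intrusion, n_other

print(score_trial(["Cold"], ["Blue"], ["Old"]))  # -> (0, 0, 1)
```

As the comment notes, such within-category confusions deflate PI without registering as intrusions, which is the bias described above.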

Another procedural manipulation that could be explored is the random presentation of the different conditions across trials. This contrasts with how actual CI users experience degraded sounds: they listen through a fixed resolution in each ear. The results may differ if the vocoder conditions were fixed within blocks.

Broadly, this study shows how the brain prefers to attend to a more salient signal, or one that is easier to understand, over a poorer one. Such asymmetries have previously been studied in the auditory system, particularly in the context of auditory development (Gordon et al., 2015; Kral et al., 2013; Tillein et al., 2016; Whitton & Polley, 2011). There may be some similarities to the much better studied and understood visual phenomenon of amblyopia, although it remains unclear how similar the effects of asymmetry are between hearing and vision (Gordon & Kral, 2019; Tillein et al., 2016). What is most interesting about this study is that the effects of asymmetrical processing occurred in the absence of developmental changes.