Short report
Sound context modulates perceived vocal emotion

https://doi.org/10.1016/j.beproc.2020.104042

Highlights

  • Perceived emotion of vocalizations is affected by the acoustical context.

  • Some musical features are processed similarly to communicative vocal signals.

  • Local judgments of an auditory scene may be influenced by global acoustic features.

Abstract

Many animal vocalizations contain nonlinear acoustic phenomena as a consequence of physiological arousal. In humans, nonlinear features are processed early in the auditory system, and are used to efficiently detect alarm calls and other urgent signals. Yet, high-level emotional and semantic contextual factors likely guide the perception and evaluation of roughness features in vocal sounds. Here we examined the relationship between perceived vocal arousal and auditory context. We presented listeners with nonverbal vocalizations (yells of a single vowel) at varying levels of portrayed vocal arousal, in two musical contexts (clean guitar, distorted guitar) and one non-musical context (modulated noise). As predicted, vocalizations with higher levels of portrayed vocal arousal were judged as more negative and more emotionally aroused than the same voices produced with low vocal arousal. Moreover, both the perceived valence and emotional arousal of vocalizations were significantly affected by both musical and non-musical contexts. These results show the importance of auditory context in judging emotional arousal and valence in voices and music, and suggest that nonlinear features in music are processed similarly to communicative vocal signals.

Section snippets

Background

When animals are highly aroused, there can be many effects on their bodies and behaviors. One important behavioral consequence of physiological arousal is the introduction of nonlinear features into the structure of vocalizations (Briefer, 2012; Fitch et al., 2002; Wilden et al., 1998). These acoustic correlates of arousal include deterministic chaos, subharmonics, and other non-tonal characteristics that can give vocalizations a rough, noisy sound quality. Nonlinear phenomena are effective in …

Participants

Twenty-three young adults (12 women, mean age = 20.8, SD = 1.3; 11 men, mean age = 25.1, SD = 3.2) with self-reported normal hearing participated in the experiment. All participants provided informed written consent prior to the experiment and were paid 15€ for their participation. Participants were tested at the Centre Multidisciplinaire des Sciences Comportementales Sorbonne Université-Institut Européen d’Administration des Affaires (INSEAD), and the protocol of this experiment was approved by the …

Results

To verify that stimuli with increasing levels of portrayed vocal arousal indeed contained more nonlinearities, we subjected the 18 vocal stimuli to acoustic analysis with Praat (Boersma, 2011), using three measures of voice quality (jitter, shimmer, noise-harmonic ratio) commonly associated with auditory roughness and noise. All three measures scaled as predicted with portrayed vocal arousal (Fig. 1, top).
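For illustration, these three voice-quality measures can be extracted in Python through Parselmouth, an interface to Praat. The following is a minimal sketch, not the authors' pipeline (their actual analysis files are linked under Data accessibility); the file name and pitch range are assumptions:

    # Minimal sketch of the three voice-quality measures via Parselmouth
    # (a Python interface to Praat). The file name and the 75-600 Hz pitch
    # range are illustrative assumptions, not the authors' settings.
    import parselmouth
    from parselmouth.praat import call

    snd = parselmouth.Sound("yell_stimulus.wav")  # hypothetical stimulus file

    # Glottal pulses, needed for jitter and shimmer
    pulses = call(snd, "To PointProcess (periodic, cc)", 75, 600)

    # Local jitter and shimmer with Praat's default analysis parameters
    jitter = call(pulses, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
    shimmer = call([snd, pulses], "Get shimmer (local)",
                   0, 0, 0.0001, 0.02, 1.3, 1.6)

    # Mean harmonics-to-noise ratio in dB; lower values correspond to a
    # higher noise-harmonic ratio, i.e. a noisier, rougher voice
    harmonicity = call(snd, "To Harmonicity (cc)", 0.01, 75, 0.1, 1.0)
    hnr = call(harmonicity, "Get mean", 0, 0)

    print(f"jitter={jitter:.4f}, shimmer={shimmer:.4f}, HNR={hnr:.2f} dB")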

Repeated-measures ANOVAs revealed a main effect of portrayed vocal arousal on judgments of …
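A repeated-measures ANOVA of this kind could, for instance, be run with statsmodels. The sketch below assumes a hypothetical long-format ratings table; the column names are invented for illustration and do not come from the authors' R files:

    # Sketch of a repeated-measures ANOVA with statsmodels' AnovaRM.
    # The CSV file and the column names ('participant', 'vocal_arousal',
    # 'context', 'valence') are hypothetical placeholders.
    import pandas as pd
    from statsmodels.stats.anova import AnovaRM

    ratings = pd.read_csv("ratings_long.csv")  # one row per trial

    res = AnovaRM(
        data=ratings,
        depvar="valence",                    # or "emotional_arousal"
        subject="participant",
        within=["vocal_arousal", "context"],
        aggregate_func="mean",               # average replicate trials per cell
    ).fit()
    print(res)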

Discussion

As expected, voices with higher levels of portrayed anger were judged as more negative and more emotionally aroused than the same voices produced with less vocal arousal. Both the perceived valence and emotional arousal of voices with high vocal arousal were significantly affected by both musical and non-musical contexts. However, contrary to what would be predicted, e.g., by the aesthetic enjoyment of nonlinear vocal sounds in rough musical textures by death metal fans (Thompson et al., 2018) or …

Data accessibility

Matlab files and stimuli to run the experiment, R files and a Python notebook to analyze the results, as well as a .fxb file for the guitar distortion plugin are available at the following URL:

https://nubo.ircam.fr/index.php/s/QAMG78HPymso26o

Funding

This study was supported by ERC Grant StG 335536 CREAM to JJA, and by a Fulbright Visiting Scholar Fellowship to ML.

CRediT authorship contribution statement

Marco Liuni: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - original draft, Writing - review & editing. Emmanuel Ponsot: Conceptualization, Methodology, Software, Validation, Formal analysis, Writing - original draft, Writing - review & editing. Gregory A. Bryant: Conceptualization, Visualization, Supervision, Writing - review & editing. J.J. Aucouturier: Conceptualization, Formal analysis, Funding acquisition, Writing - review & editing.

Acknowledgements

The authors thank Hugo Trad for help running the experiment. All data were collected at the Sorbonne-Université INSEAD Center for Behavioural Sciences.

References (23)

  • Blumstein, D.T., et al. (2009). The sound of arousal: The addition of novel non-linearities increases responsiveness in marmot alarm calls. Ethology.