Elsevier

Cortex

Volume 134, January 2021, Pages 320-332
Research Report
Dynamic acoustic salience evokes motor responses

https://doi.org/10.1016/j.cortex.2020.10.019

Abstract

Audio-motor integration is currently viewed as a predictive process in which the brain simulates upcoming sounds based on voluntary actions. This perspective does not consider how our auditory environment may trigger involuntary action in the absence of prediction. We address this issue by examining the relationship between acoustic salience and involuntary motor responses. We investigate how acoustic features in music contribute to the perception of salience, and whether those features trigger involuntary peripheral motor responses. Participants with little-to-no musical training listened to musical excerpts once while remaining still during the recording of their muscle activity with surface electromyography (sEMG), and again while they continuously rated perceived salience within the music using a slider. We show cross-correlations between 1) salience ratings and acoustic features, 2) acoustic features and spontaneous muscle activity, and 3) salience ratings and spontaneous muscle activity. Amplitude, intensity, and spectral centroid were perceived as the most salient features in music, and fluctuations in these features evoked involuntary peripheral muscle responses. Our results suggest an involuntary mechanism for audio-motor integration, which may rely on brainstem-spinal or brainstem-cerebellar-spinal pathways. Based on these results, we argue that a new framework is needed to explain the full range of human sensorimotor capabilities. This goal can be achieved by considering how predictive and reactive audio-motor integration mechanisms could operate independently or interactively to optimize human behavior.

Introduction

The human nervous system closely integrates perception and action. This integration allows the body to respond to and align itself with a dynamically changing environment. Although neural and cognitive models have increased our understanding of perception-action coupling in voluntary actions, understanding of this coupling in involuntary action remains limited. In particular, it is not known whether highly complex and continuous stimuli, such as speech or music, can evoke involuntary muscle responses. For example, the startle response is an involuntary motor response to a stimulus that stands out relative to neighboring stimuli and, consequently, is perceived as salient. This reflex normally occurs to extremely sudden sounds, and is mediated by a brainstem-spinal pathway. If a muscle response can also be evoked by prominent changes embedded within complex and continuous auditory streams such as speech or music, it would imply that a similarly fast and direct auditory-brainstem pathway might also contribute to complex listening tasks. The present experiment examined whether involuntary motor responses are evoked by changes in acoustic features within complex and naturalistic musical excerpts, and whether these same acoustic features are perceived as salient. We aimed to establish a three-fold link between 1) involuntary muscle responses and acoustic features within naturalistic stimuli, 2) involuntary muscle responses and perceived salience, and 3) acoustic features and perceived salience.

The first aim of this study was to examine whether involuntary muscle responses can be evoked by changes within highly complex auditory stimuli. Involuntary audio-motor integration has received little theoretical and empirical attention compared with voluntary audio-motor integration. Recent neural and cognitive models for sensorimotor integration emphasize how voluntary movements can filter or enhance perception through prediction (Blakemore, Frith, & Wolpert, 1999; Morillon & Schroeder, 2015; Wolpert, Ghahramani, & Jordan, 1995). Actions could be used to focus attention on predictable points in time, potentially through oscillatory neural activity that entrains attention-driven perceptual gating to external stimuli (Morillon & Baillet, 2017; Morillon, Hackett, Kajikawa, & Schroeder, 2015; Morillon, Schroeder, & Wyart, 2014; Schroeder & Lakatos, 2009; Schroeder, Lakatos, Kajikawa, Partan, & Puce, 2008; Schroeder, Wilson, Radman, Scharfman, & Lakatos, 2010). This perspective helps us understand how we align our movements and/or attention with regularities, or predictable patterns, such as temporal structures within speech or music. What is not yet understood is whether complex external stimuli that lack a clearly predictable structure may still engage the motor system (Bendixen, SanMiguel, & Schröger, 2012). Organisms may not always have the capacity or information available to predict accurately, and we must be able to respond to changes that are unpredictable.

Neural pathways for startle and orienting responses are useful in this case because they allow a response to unexpected changes in the environment. The vertebrate motor system is known to show an involuntary response to ambient changes in sound. Sounds that are highly intense and infrequent relative to other ambient sounds evoke a startle reflex (Brown et al., 1991a, 1991b; Davis, Gendelman, Tischler, & Gendelman, 1982). The caudal brainstem has been identified as the origin of this reflex, which is thought to propagate from the auditory nerve to the bulbopontine reticular formation to the motor periphery. This response allows humans and other vertebrates to detect and automatically respond to a sudden, unexpected change in the environment quickly (<150 ms; Brown et al., 1991a). Importantly, even sound changes that are not highly intense can evoke responses in sensory receptor organs. The orienting response, such as turning the eyes or head toward the source of a sound (Johnson & Lubin, 1967; Sokolov, 1963), depends only on a change in sound parameters, regardless of the magnitude of that change. The orienting response occurs immediately following a change (e.g., a sound on or a sound off), and it disappears when the feature that changed maintains the same state. The orienting response involves a combination of neuronal firing, autonomic and muscle responses, and may also rely on the brainstem (Sokolov, 1963), potentially the superior colliculus (Goodale & Murison, 1975). Finally, the ability to quickly adjust or adapt ongoing behavior to changes in ambient sounds may further rely on the cerebellum (Sokolov, Miall, & Ivry, 2017), potentially through fast cerebellar-to-frontal cortex transmission (Schwartze & Kotz, 2013, 2016). Thus, humans possess subcortical machinery whereby features of sound can innervate the muscles without conscious control.

The utility of these reflexive pathways for survival is clear, such as responding efficiently to potential threats (Sokolov, 1963). Beyond their use for survival, these phylogenetically older pathways may also enable organisms to respond to dynamic and complex acoustic stimuli such as spoken language or music. The question remains, in which contexts can involuntary audio-motor innervation occur? Do only highly salient and rare sounds innervate the muscles or can changes along a range of acoustic features innervate the muscles? The few studies that have examined spontaneous human movement to sounds demonstrated that certain features of a musical texture, such as the entrance of different instruments, can influence the amount and quality of spontaneous overt (observable) movement (Burger, Thompson, Luck, Saarikallio, & Toiviainen, 2014; Hurley, Martens, & Janata, 2014; Janata, Tomic, & Haberman, 2012). However, these studies sought to document alignment of overt movement with general aspects of musical structure, for example, a regular beat. It remains an open question whether features of sound that do not correspond to a predictable structure within a complex sound stream such as music, can evoke involuntary, subthreshold muscle activity. Music and speech contain a wealth of acoustic features that often converge on a predictable temporal structure (Jones, 2001; Rothermich, Schmidt-Kassow, & Kotz, 2012), yet they also contain unpredictable events that may also engage the motor system. Given that the motor system is flexibly attuned not only to the buildup of regularities over time (Morillon, Schroeder, Wyart, & Arnal, 2016), but also to instantaneous relative changes in stimulus features (Sokolov, 1963), the motor system may be sensitive to a range of features of sound even within complex listening contexts. 
Involuntary audio-motor integration could operate across many or all sound processing contexts, even contexts that require detection of and alignment to patterns, such as tapping along with a beat in music (cf. Repp, 2005; Repp & Su, 2013) or coordinating responses to a conversation partner (e.g., Schultz et al., 2016). We address this issue by examining whether a range of acoustic changes in music trigger involuntary motor responses.

The second aim of the study was to examine the link between acoustic features that may evoke involuntary motor responses and the perception of salience. The human auditory system is highly attuned to change. This change sensitivity is captured by the concept of salience. Salience is the perception of a stimulus as distinct, prominent, or conspicuous relative to neighboring stimuli. Salience is evoked by a change or contrast, often sudden or sharp, along a particular stimulus feature (Borji, Sihite, & Itti, 2013; Parkhurst, Law, & Niebur, 2002). Visual salience and the ways in which visual features highlight points in space are well-documented (Borji et al., 2013; Gottlieb, Kusunoki, & Goldberg, 1998; Itti, 2006; Parkhurst et al., 2002; Thompson & Bichot, 2004). Acoustic salience is far less studied, despite its utility in highlighting points in time with high precision (Benoit et al., 2014; Dalla Bella, Benoit, Farrugia, Schwartze, & Kotz, 2015; Ellis & Jones, 2009; Fernandez-Del-Olmo et al., 2013; Rodger & Craig, 2016). The neural signatures of acoustic salience are well documented. For instance, the mismatch negativity (MMN) response is a comparative evoked potential that is understood to be a measure of pre-attentive change detection (Schröger, 1997; Schröger & Winkler, 1995). However, we still know little about how particular acoustic features contribute to the perception of salience, particularly within naturalistic stimuli that combine multiple features. Here, we aim to isolate particular acoustic features within music and identify those that contribute to the perception of salience, as well as to involuntary motor responses.

Salience can be derived from multiple acoustic features. Acoustic features such as amplitude and intensity (perceived as loudness), frequency (perceived as pitch), spectral properties of individual sounds (timbre), and interactions between multiple sounds (harmony) can distinguish an individual event at a particular point in time from an immediately preceding event(s), such as when a breaking object produces a sudden crash while you are listening to someone speaking (Caclin et al., 2006; Caclin, McAdams, Smith, & Winsberg, 2005). The abrupt changes in loudness and timbre features are potential sources of salience that can signal an event at a particular point in time (Chon & McAdams, 2012; McAdams, Winsberg, Donnadieu, De Soete, & Krimphoff, 1995). Here we focus on acoustic features that define how sound unfolds over time, so that we can examine motor responses to instantaneous acoustic events that do not necessarily depend on predictable patterns in those events. We further investigate possible influences of predictable acoustic patterns (e.g., beat structures and harmonic context) on motor responses. We examined amplitude and intensity, timbre, and harmonic features (see Method for full descriptions). Amplitude and intensity are known to contribute to the startle response (Brown et al., 1991a; Davis et al., 1982) and, in the context of music, these features correspond to perceived arousal levels conveyed by the music (Dean, Bailes, & Schubert, 2011). Acoustic salience based on timbre has received little attention. Here, we focus on spectral measures that capture global features of the sound spectrum, and that have been found to influence the discriminability of different timbres (Caclin et al., 2005; McAdams et al., 1995) and to elicit an MMN response when there is a change along this dimension (Caclin et al., 2006). 
We therefore examined whether amplitude, intensity, timbre, and harmonic features both evoke involuntary motor responses and contribute to the perception of salience. If so, salient events within complex sounds such as music may evoke involuntary motor responses. This is a crucial missing piece in understanding the full sensorimotor capacity of the human nervous system. In addition, to examine whether involuntary motor responses to, and perceived salience of, these features were dependent on predictability, we also investigated involuntary motor responses to high-level acoustic features that represent predictability in harmony (harmonic change) and timing (beat and downbeat). We hypothesized that changes in intensity, amplitude, timbre, and harmonic features should evoke spontaneous muscle activity and increased salience ratings, regardless of the degree of harmonic or temporal predictability.
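To make the notion of a continuous acoustic feature concrete, the following is a minimal sketch of how frame-wise amplitude (root mean square) and spectral centroid time series could be computed from an audio signal. It is an illustration only: the function name `frame_features`, the frame length, the hop size, and the Hann window are assumptions, and this excerpt does not state which tools or parameters were used for feature extraction in the study.

```python
import numpy as np

def frame_features(signal, sr, frame_len=1024, hop=512):
    """Return frame-wise RMS amplitude and spectral centroid (Hz).

    frame_len and hop are illustrative defaults, not values from the study.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)                      # taper to limit spectral leakage
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)      # bin centre frequencies in Hz
    rms, centroid = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rms.append(np.sqrt(np.mean(frame ** 2)))        # amplitude of this frame
        mag = np.abs(np.fft.rfft(frame * window))       # magnitude spectrum
        total = mag.sum()
        # Centroid = magnitude-weighted mean frequency; 0 for silent frames
        centroid.append((freqs * mag).sum() / total if total > 0 else 0.0)
    return np.array(rms), np.array(centroid)

if __name__ == "__main__":
    sr = 8000
    t = np.arange(sr) / sr                              # one second of audio
    tone = np.sin(2 * np.pi * 440 * t)                  # synthetic 440 Hz tone
    rms, centroid = frame_features(tone, sr)
    print(rms[:3], centroid[:3])
```

Each returned array is a time series with one value per frame, which is the form in which a feature can be compared against continuous ratings or muscle activity.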

The overall goal of the present study was to establish whether there is a direct link between salient acoustic features and involuntary motor responses within the context of a continuous and dynamic sound stream, in this case music. We investigated the dynamic link between continuous changes in acoustic features, salience perception, and involuntary responses in the motor periphery. First, we address whether perceived salience could be a mechanism behind involuntary motor responses. Second, we investigate which acoustic features elicit changes in salience perception as music unfolds. Third, we determine whether changes in particular acoustic features within a complex sound stream evoke involuntary muscle responses while passively listening to music. Thus, we investigate how the peripheral motor system dynamically responds to salient acoustic features. Participants with little to no musical training listened to musical excerpts once while remaining still during the recording of their muscle activity using surface electromyography (sEMG), and again while they continuously rated perceived salience within the music using a slider. The slider measurement provided a means to capture salience perception in a dynamic manner, allowing sensitivity to dynamically unfolding acoustic signals that may evoke salience. Musical excerpts were 40-sec segments of commercially available musical recordings of various genres, tempi, and temporal regularity (i.e., beat salience). We applied time series analysis with dynamic time warping (DTW) to examine the correspondence between various dynamic acoustic features, salience ratings, and muscle activity. 
We hypothesized that 1) continuous ratings of perceived salience correspond to spontaneous motor activity, 2) continuous ratings of salience correlate with acoustic features as they unfold in time, that is, that certain acoustic features are perceived as salient, and 3) sEMG activity correlates with acoustic features, that is, that certain acoustic features evoke motor responses.
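As an illustration of the dynamic time warping (DTW) idea mentioned above, the sketch below computes a textbook DTW distance between two z-scored time series, for example an acoustic feature and a slightly delayed response. It is not the authors' analysis pipeline: the step pattern, the absolute-difference local cost, the z-scoring, and the names `dtw_distance` and `zscore` are assumptions chosen for clarity, and the signals are synthetic.

```python
import numpy as np

def zscore(series):
    """Standardize a series so that scale differences do not dominate the cost."""
    series = np.asarray(series, dtype=float)
    return (series - series.mean()) / series.std()

def dtw_distance(x, y):
    """Classic DTW distance with steps (1,0), (0,1), (1,1) and |a - b| local cost."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor paths
            cost[i, j] = local + min(cost[i - 1, j],
                                     cost[i, j - 1],
                                     cost[i - 1, j - 1])
    return cost[n, m]

if __name__ == "__main__":
    t = np.linspace(0, 40, 400)                          # 40 s sampled at about 10 Hz
    feature = zscore(np.sin(2 * np.pi * 0.25 * t))       # synthetic acoustic feature
    response = zscore(np.sin(2 * np.pi * 0.25 * (t - 0.3)))  # same shape, delayed
    rng = np.random.default_rng(0)
    shuffled = rng.permutation(response)                 # destroys temporal structure
    print("aligned pair:", dtw_distance(feature, response))
    print("shuffled pair:", dtw_distance(feature, shuffled))
```

Because DTW allows local stretching of the time axis, a response that follows a feature with a small lag yields a low distance, whereas a temporally scrambled series does not. Comparing observed distances with those from scrambled or mismatched series is one possible way to estimate a chance level; how chance was estimated in the study is described in its Method section, not here.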

Method

We report how we determined our sample size, all data exclusions (if any), all inclusion/exclusion criteria, whether inclusion/exclusion criteria were established prior to data analysis, all manipulations, and all measures in the study.

The relationship between perceived salience and motor activity

To test the hypothesis that salience ratings correspond to spontaneous motor activity, the relationship between continuous motor activity at three sEMG electrode sites (hand, lower arm, upper arm) and continuous salience ratings was compared to estimated chance levels using a linear mixed-effects model (LMEM). All relationships were significantly above chance (ps < .002, OR = 10.81; see Fig. 2). This relationship significantly decreased in strength from the hand to the lower arm and from the lower arm to the upper arm (ps

Discussion

The present study examined whether involuntary motor responses are evoked by changes in acoustic features within complex, continuous, and naturalistic musical excerpts, and whether these same acoustic features are perceived as salient. We show that changes in several acoustic features within music corresponded to changes in involuntary motor responses and to perceived acoustic salience. Amplitude, intensity, and spectral centroid were perceived as the most salient features in music, and they

Conclusion

We provide evidence for auditory-motor resonance, an automatic innervation of peripheral motor activity by acoustic events within a complex stream that then leads to a temporal alignment of acoustic and motor fluctuations. As a purely stimulus-driven mechanism that links external events with internal motor activity, resonance closes the theoretical gap between voluntary, memory-driven behaviors, and the origins of those behaviors. Resonance may also play an important role in voluntary movement

CRediT author statement

Benjamin G. Schultz: Conceptualization, Formal analysis, Methodology, Software, Data curation, Project administration, Resources, Visualization, Writing - Original draft of Methods and Results sections, Reviewing and Editing. Rachel Brown: Writing - Original draft of Introduction and Discussion sections, Reviewing and Editing. Sonja Kotz: Conceptualization, Supervision, Resources, Reviewing, and Editing.

Open practices

The study in this article earned an Open Data badge for transparent practices. Materials and data for the study are available at https://doi.org/10.34894/JQAEIR.

Acknowledgments

This work was supported by funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 707865 to R.M.B. and S.A.K.

Study data, digital study materials, and analysis code are publicly available at https://doi.org/10.34894/JQAEIR. The archived audio files contain copyrighted materials and will be shared unconditionally on request by contacting: [email protected].

No part of the study

References (61)

  • C.E. Schroeder et al. Low-frequency neuronal oscillations as instruments of sensory selection. Trends in Neurosciences (2009)
  • C.E. Schroeder et al. Neuronal oscillations and visual amplification of speech. Trends in Cognitive Sciences (2008)
  • C.E. Schroeder et al. Dynamics of active sensing and perceptual selection. Current Opinion in Neurobiology (2010)
  • E. Schröger et al. Presentation rate and magnitude of stimulus deviance effects on human pre-attentive change detection. Neuroscience Letters (1995)
  • M. Schwartze et al. A dual-pathway neural architecture for specific temporal prediction. Neuroscience and Biobehavioral Reviews (2013)
  • M. Schwartze et al. Contributions of cerebellar event-based temporal processing and preparatory function to speech perception. Brain and Language (2016)
  • A.A. Sokolov et al. The cerebellum: Adaptive prediction for movement and cognition. Trends in Cognitive Sciences (2017)
  • C.-E. Benoit et al. Musically cued gait-training improves both perceptual and motor timing in Parkinson's disease. Frontiers in Human Neuroscience (2014)
  • L. Bishop et al. Musical expertise and the ability to imagine loudness. PLoS One (2013)
  • S.J. Blakemore et al. Spatio-temporal prediction modulates the perception of self-produced stimuli. Journal of Cognitive Neuroscience (1999)
  • A.J. Blood et al. Emotional responses to pleasant and unpleasant music correlate with activity in paralimbic brain regions. Nature Neuroscience (1999)
  • S. Böck et al. Enhanced beat tracking with context-aware neural networks
  • M. Bradley et al. Affective reactions to acoustic stimuli. Psychophysiology (2000)
  • P. Brown et al. New observations on the normal auditory startle reflex in man. Brain: a Journal of Neurology (1991)
  • P. Brown et al. The hyperekplexias and their relationship to the normal startle reflex. Brain: a Journal of Neurology (1991)
  • B. Burger et al. Hunting for the beat in the body: On period and phase locking in music-induced movement. Frontiers in Human Neuroscience (2014)
  • A. Caclin et al. Separate neural processing of timbre dimensions in auditory sensory memory. Journal of Cognitive Neuroscience (2006)
  • A. Caclin et al. Acoustic correlates of timbre space dimensions: A confirmatory study using synthetic tones. The Journal of the Acoustical Society of America (2005)
  • S.H. Chon et al. Investigation of timbre saliency, the attention-capturing quality of timbre. The Journal of the Acoustical Society of America (2012)
  • S. Dalla Bella et al. Effects of musically cued gait training in Parkinson's disease: Beyond a motor benefit. Annals of the New York Academy of Sciences (2015)

1 First authorship is shared between these authors.
