Elsevier

Neuropsychologia

Volume 159, 20 August 2021, 107916

Kinematic boundary cues modulate 12-month-old infants’ segmentation of action sequences: An ERP study

https://doi.org/10.1016/j.neuropsychologia.2021.107916

Highlights

  • 12-month-old infants observed action sequences with or without kinematic boundary cues.

  • Kinematic boundary cues consisted of pre-boundary lengthening and pause.

  • Infant responses were measured via EEG.

  • Infants were sensitive to the presence of kinematic boundary cues.

  • Kinematic boundary cues modulated infants' processing of the subsequent action.

Abstract

Human infants can segment action sequences into their constituent actions already during the first year of life. However, work to date has almost exclusively examined the role of infants' conceptual knowledge of actions and their outcomes in driving this segmentation. The present study examined electrophysiological correlates of infants' processing of lower-level perceptual cues that signal a boundary between two actions of an action sequence. Specifically, we tested the effect of kinematic boundary cues (pre-boundary lengthening and pause) on 12-month-old infants' (N = 27) processing of a sequence of three arbitrary actions, performed by an animated figure. Using the Event-Related Potential (ERP) approach, evidence of a positivity following the onset of the boundary cues was found, in line with previous work that has found an ERP positivity (Closure Positive Shift, CPS) related to boundary processing in auditory stimuli and action sequences in adults. Moreover, an ERP negativity (Negative Central, Nc) indicated that infants’ encoding of the post-boundary action was modulated by the presence or absence of prior boundary cues. We therefore conclude that 12-month-old infants are sensitive to lower-level perceptual kinematic boundary cues, which can support segmentation of a continuous stream of movement into individual action units.

Introduction

A critical stage in action processing is the segmentation of the individual actions out of a continuous stream of movement. From everyday actions like making a cup of coffee, to more skilled actions such as knitting, in order to understand or reproduce these action sequences, the constituent actions (e.g., taking cafetière from the cupboard, opening coffee jar, spooning coffee out) must be identified and encoded. A wealth of research has examined the cognitive processes that underlie action segmentation during infancy. In an influential study, Baldwin et al. (2001) presented 10- to 11-month-old infants with videos showing continuous everyday action sequences, and found longer looking times to videos that paused during ongoing actions than videos that paused following the completion of these actions (e.g., a pause during vs. after the grasp of a towel). This finding demonstrated that already by the end of the first year of life, infants represent ongoing actions and the boundaries between these actions as separate structural elements of an action sequence. The indication that infants associate action boundaries with moments of action completion has led some scholars to suggest that infants track actors’ intentions and segment action sequences according to the fulfilment of related goals (e.g., Saylor et al., 2007). Such an account would posit that infants can rely on top-down processing, that is, the application of pre-existing knowledge and experience of actions, goals, and intentions to segment continuous motion into its constituent actions.

More recent work has suggested that action segmentation can also be supported by bottom-up processing of perceptual cues contained within the action stream, at least in adult populations. Studies of action production have revealed that boundaries in action sequences are marked by certain changes in kinematic properties of the movement forming the actions. For example, changes in motion velocity around a boundary (e.g., McAleer et al., 2014; Zacks et al., 2009) serve to lengthen the pre-boundary action and introduce a subsequent pause (i.e., a short absence of motion; Hilton et al., 2019). Adults make use of these kinematic boundary cues to segment observed action sequences, especially when top-down information relating to goal completion is not available. For example, when presented with a novel dance routine, participants with no dancing experience reported pause detection as a strategy for determining the boundaries between individual dance moves (Bläsing, 2014). In another study, Hemeren and Thill (2010) asked adults to identify boundaries between individual actions of video-taped everyday action sequences (e.g., unscrewing a bottle cap) by button-press. Then, to restrict application of top-down processing to identify the boundaries, the videos were converted to moving constellations of light points and presented to another group of participants. Although the participants who viewed the light-point constellations were no longer able to determine the nature of the action sequences, they identified boundaries in the same locations as participants who saw the unconverted videos. This work neatly demonstrates that the movement stream must have contained lower-level perceptual kinematic boundary cues, which could be sufficient to support action segmentation in adults.

This previous work raises the critical question of how the bottom-up processing of kinematic boundary cues develops throughout infancy and early childhood. Existing work in this area has typically examined processing of naturally-produced action sequences performed by humans (e.g., Baldwin et al., 2001; Hespos et al., 2010; Saylor et al., 2007), meaning that effects of top-down and bottom-up processing cannot be disentangled. However, initial evidence for bottom-up processing in early action segmentation was presented in a recent study with 10- to 14-month-old infants, revealing differential processing of motionless pauses inserted within or at the boundary between human actions that were unfamiliar to the infants and did not involve attainment of object-related goals (i.e., Olympic figure skating; Pace et al., 2020). This study design meant, however, that these action sequences were also naturally-produced human action, making it difficult to fully exclude experience-based top-down processing, and meaning that these sequences were not controlled for the occurrence of kinematic boundary cues other than pauses.

The present study aimed to investigate bottom-up processing of kinematic boundary cues in infants' action segmentation. To this end, the work was guided by potential parallels between speech and action processing, and aimed to extend existing research that has sought to isolate the bottom-up processes driving speech segmentation during infancy. As continuous information streams that can be organized hierarchically, action sequences and speech share structural similarities (Zacks, 2004). Speech segmentation can be driven by top-down knowledge-based processing, but infants initially capitalize on bottom-up cues embedded in prosody to determine boundaries between words and phrases in speech (e.g., Christophe et al., 2003). Three such prosodic boundary cues in several languages are silent pauses, lengthening of the pre-boundary material, and pre-boundary pitch-rise (e.g., Hockey and Fagyal, 1998; Peters, 2005; Tyler and Cutler, 2009). Furthermore, while infant-directed speech has been found to exaggerate prosodic boundary cues (e.g., Church et al., 2005), infant-directed action (also known as motionese; Brand et al., 2002) is characterised by an increase in frequency and duration of pauses, as well as increased duration of actions (Fritsch et al., 2005; Rohlfing et al., 2006), which could reflect an exaggeration of the kinematic boundary cues pause and pre-boundary lengthening. Drawing on arguments that infant-directed speech supports infants' speech processing by facilitating speech segmentation in certain language populations (Thiessen et al., 2005; Floccia et al., 2016), it is plausible that motionese supports infants' action processing by facilitating action segmentation, warranting a more fine-grained examination of infants’ processing of kinematic boundary cues.

A well-established method for studying the processing of prosodic boundary cues in speech segmentation is EEG. For example, Holzgrefe-Lang et al. (2018) presented 6- to 8-month-old infants raised in German-speaking households with spoken sequences consisting of three proper names coordinated by und (“and”; e.g., “Mona und Lilli und Lola”). Critically, the sequences did or did not contain an internal grouping, which was marked by prosodic boundary cues (i.e., lengthening of the pre-boundary syllable, a rise of pitch at the second name, and the insertion of a subsequent pause). The onset of the prosodic boundary cues evoked a broadly-distributed slow-forming positivity in the event-related potential (ERP) which continued until the onset of the post-boundary word. This positivity was understood as an example of the Closure Positive Shift (CPS; Steinhauer et al., 1999), a well-established ERP marker of prosodic boundary processing in adults (e.g., Holzgrefe et al., 2013; Pannekamp et al., 2005) and children (Männel et al., 2013; Männel and Friederici, 2016).

The CPS in adults has also been found in response to pre-boundary lengthening and pauses in non-speech auditory streams of information such as music (Glushko et al., 2016) or hummed speech (Pannekamp et al., 2005). These findings suggest that the CPS is not a language-specific marker of boundary processing, and have led some scholars to conclude that the component reflects domain-general cognitive processes involved in the segmentation of continuous streams of information (e.g., Gilbert et al., 2015). As further support for this assumption, recent work found an ERP positivity in response to kinematic boundary cues (i.e., pre-boundary lengthening, pause) embedded in visual sequences of hand actions performed on a ball with no discernible goal (e.g., sliding, shaking, lifting; Hilton et al., 2019). This ERP positivity was similar in spatial distribution and timing to the CPS. This work with adults thus identified pre-boundary lengthening and pause as kinematic boundary cues that signal the location of a boundary in an action sequence, and suggested the CPS as a candidate ERP component that reflects adults' and potentially infants’ processing of these cues.

Another ERP component that has been related to infants' action processing is the Negative Central (Nc) component, characterised as a negative peak over fronto-central electrodes, emerging between 300 and 900 ms following the onset of a visual stimulus (e.g., Nelson and Collins, 1991; Reynolds and Richards, 2005). The Nc is regarded as a measure of attention and memory activation during the first year of life (e.g., Richards et al., 2010; Reynolds and Richards, 2019), with a larger Nc amplitude reflecting greater engagement with, and thus encoding of, a visual stimulus. Nine-month-old infants showed a larger Nc amplitude to familiar than novel actions, probably reflecting greater allocation of attentional resources (e.g., Kaduk et al., 2016). Similarly, Schönebeck and Elsner (2019) found an Nc-like negativity in 14-month-old infants in response to images of an action outcome (e.g., hands holding two separated halves of a dumbbell; Meltzoff, 1995), with the negativity being larger when infants had witnessed actions after which this outcome was expected (i.e., attempts to pull the dumbbell apart) rather than unexpected (i.e., an action that could not have resulted in the dumbbell's separation). Taken together, these findings suggest that infants allocate more attentional resources to familiar or expected actions and action outcomes, in this case reflected by larger Nc amplitudes. Conversely, Pace et al. (2013) presented 24-month-old infants with videos of interrupted and uninterrupted action sequences, and found a relatively larger ERP negativity in the 300–600 ms time window in response to interrupted sequences, which they interpreted as evidence that these infants devoted more attention to the interrupted than uninterrupted action sequences.
In line with work examining the Nc in response to familiar and unfamiliar pictures (e.g., de Haan and Nelson, 1997; Reynolds and Richards, 2019), these examples show that it is sometimes difficult to predict whether infants will allocate more attention to familiar and expected or unfamiliar and unexpected actions or action outcomes, a difficulty which is further compounded by differences in study design, age of sample, and stimuli in this previous work.
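
As a rough illustration of how an Nc-style measure might be quantified, the sketch below averages a single-channel waveform over a latency window such as the 300 to 900 ms range mentioned above. The function name, the sampling rate, and the toy waveform are all hypothetical; they are not taken from any of the cited studies or their analysis pipelines.

```python
from statistics import mean


def window_mean_amplitude(epoch_uv, sfreq_hz, epoch_start_ms, win_start_ms, win_end_ms):
    """Mean amplitude (in microvolts) of one waveform inside a latency window.

    epoch_uv       -- amplitude samples for one channel (hypothetical data)
    sfreq_hz       -- sampling rate in Hz
    epoch_start_ms -- latency of the first sample relative to stimulus onset
    win_start_ms, win_end_ms -- inclusive window limits in ms
    """
    step_ms = 1000.0 / sfreq_hz
    selected = [
        value
        for index, value in enumerate(epoch_uv)
        if win_start_ms <= epoch_start_ms + index * step_ms <= win_end_ms
    ]
    if not selected:
        raise ValueError("time window lies outside the epoch")
    return mean(selected)


# Toy waveform sampled at 10 Hz (one sample every 100 ms), from -200 to 1000 ms.
# The values are invented purely to exercise the function.
toy_epoch = [0.0, 0.0, 0.0, -1.0, -2.0, -4.0, -6.0, -8.0, -6.0, -4.0, -2.0, -1.0, 0.0]
nc_mean = window_mean_amplitude(toy_epoch, sfreq_hz=10, epoch_start_ms=-200,
                                win_start_ms=300, win_end_ms=900)
print(nc_mean)
```

A more negative window mean would correspond to a larger Nc under this simplified measure; real pipelines would also involve filtering, baseline correction, artifact rejection, and averaging across trials and electrodes.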

In a more recent study that examined the Nc specifically in the context of action segmentation (Monroy et al., 2019), 8- to 11-month-old infants were trained by watching continuous sequences of pairs of goal-directed actions performed on a novel mechanical toy (e.g., sliding a switch, twisting a joystick). Some action pairs exerted a subsequent effect (i.e., a lamp illuminating after completion of the second action), but other pairs did not. A test phase followed, during which the infants were presented with sequences of two images, each depicting the mid-point of an action seen during training. The sequences were composed of either intact action pairs that had been presented during training, or deviant pairs consisting of previously-seen actions that had not been paired during training. The authors found an Nc in response to the second action of all pairs, demonstrating that the Nc component can be evoked by individual actions of an action sequence. Furthermore, the amplitude of the Nc was greatest in response to the second action of a deviant pair when the first action had been part of an action pair that triggered an action effect during the training phase. To explain this finding, the authors speculated that the infants could have chunked, or formed a single segment of, the two actions of those pairs that exerted an effect. Thus, the larger Nc could have indicated that presentation of the first action triggered an expectation regarding the second action of the pair, which was violated by the second, deviant action. Irrespective of theoretical implications, these findings demonstrated that differences in Nc amplitude are a useful measure by which to examine infants' processing of individual actions in an action sequence.
Building on this work, we therefore elected to adopt the Nc as a candidate component by which to examine infants’ bottom-up processing of kinematic boundary cues: If infants make use of kinematic boundary cues to determine the structure of action sequences, it would be expected that their processing of an action following these cues would differ from an action in the same position that is not preceded by kinematic boundary cues. Thus, the Nc could indicate whether infants process an action as either a continuation of the current action segment, or the initial action of a new segment.

The purpose of the current study was two-fold. First, we wanted to determine whether ERP components indicate that infants can detect kinematic boundary cues in an action sequence already by the end of the first year of life. Second, to determine whether the processing of these cues supports infants' action segmentation, we investigated whether kinematic boundary cues would modulate the ERP response to a subsequent action. We therefore recorded ERP activity while 12-month-old infants watched sequences of actions that did or did not contain kinematic boundary cues. In order to be able to single out the contribution of bottom-up processing, and to accurately control the timing and movement of the actions, we followed previous work into action processing (e.g., Twomey et al., 2014) by presenting infants with videos of an animated character performing arbitrary sequences of three whole-body actions (i.e., turning, stretching, lifting). On no-boundary trials, the three actions were performed as a single continuous sequence with no kinematic boundary cues present, whereas on boundary trials, a boundary was marked by pre-boundary lengthening of the second action and a pause between the second and final action. In naturally-occurring action sequences, the duration of kinematic boundary cues is relatively brief, but EEG's high temporal resolution meant that online processing of these cues could be recorded. If the 12-month-old infants were sensitive to the presence of these kinematic boundary cues, a boundary-related CPS-like positivity between the second and third actions on boundary trials should be found (cf. Hilton et al., 2019; Holzgrefe-Lang et al., 2018). We also examined attention to and encoding of each action in the sequence as reflected by the Nc component. If kinematic boundary cues affected infants' processing of the subsequent action, between-condition differences in the Nc response to the third action should be demonstrated.

Participants

The final sample included twenty-seven 12-month-old infants (mean age = 11.7 months; SD = 0.7; 13 girls). All children were full-term, typically-developing, and monolingual from German-speaking households. Additionally, 14 infants were tested but their data were not included in analyses due to refusal to wear the EEG cap (n = 3), technical problems (n = 6), the participant not meeting sample inclusion criteria (infant raised in bilingual environment; n = 1), or attention to the screen did not meet

Boundary-related positivity analysis

A 3 (region: frontal, central, posterior) × 2 (condition: boundary, no-boundary) repeated-measures ANOVA on the mean maximum amplitude from all electrodes of interest revealed a significant main effect of condition, F(1, 26) = 73.11, p < .001, ηG² = 0.34, resulting from a larger positivity in the boundary condition (M = 75.32 μV, SD = 13.91) than in the no-boundary condition (M = 56.15 μV, SD = 11.19; Fig. 2) across all electrodes. A significant main effect of region was also found, F(2,
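
For a within-subject factor with only two levels, the repeated-measures F statistic for that factor's main effect equals the square of a paired t statistic computed on each participant's condition means (collapsed across any other within-subject factor), with degrees of freedom (1, n − 1). The sketch below illustrates that relationship with invented values; the function name and the numbers are hypothetical, and this is not the authors' analysis code or data.

```python
from math import sqrt
from statistics import mean, stdev


def two_level_within_f(condition_a, condition_b):
    """F statistic and degrees of freedom for a two-level within-subject factor.

    Each list holds one value per participant (e.g. that participant's mean
    amplitude in a condition, already averaged across regions).
    """
    if len(condition_a) != len(condition_b):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    # Paired t statistic: mean difference divided by its standard error.
    t_value = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_value ** 2, (1, n - 1)


# Invented per-participant means for four hypothetical participants.
boundary = [5.0, 7.0, 6.0, 9.0]
no_boundary = [3.0, 4.0, 5.0, 5.0]
f_value, dof = two_level_within_f(boundary, no_boundary)
print(f_value, dof)
```

With 27 participants, the same calculation gives degrees of freedom of (1, 26), which matches the form of the condition effect reported above.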

Discussion

The current study examined the effect of two kinematic boundary cues, pre-boundary lengthening and pause, on 12-month-old infants' processing of an action sequence. Infants were presented with videos in which an animated character performed a sequence of three actions, with or without kinematic boundary cues to mark a boundary between the second and third action. ERP activity was recorded while infants viewed these sequences, and analyses revealed that the onset of kinematic boundary cues

Credit author statement

Matt Hilton: Conceptualization, Methodology, Software, Formal analysis, Writing – original draft, Visualisation. Isabell Wartenburger: Conceptualization, Methodology, Writing – Reviewing & Editing, Supervision, Funding acquisition. Birgit Elsner: Conceptualization, Methodology, Writing – Reviewing & Editing, Supervision, Funding acquisition.

Declaration of competing interest

None.

Acknowledgements

This work was supported by a grant from the German Research Foundation (DFG, project number 258522519) within the Research Unit Crossing the Borders FOR 2253 to IW (WA 2969/7–1) and BE (EL 253/6–1). With thanks to the families who took part, Jakob Junge for help with stimuli creation, Romy Räling for help with experimental design, and the Babylab team for help with data collection.

References (49)

  • M. Schönebeck et al. ERPs reveal perceptual and conceptual processing in 14-month-olds' observation of complete and incomplete action end-states. Neuropsychologia (2019)

  • J.M. Zacks. Using movement and intentions to understand simple events. Cognit. Sci. (2004)

  • J.M. Zacks et al. Using movement and intentions to understand human activity. Cognition (2009)

  • D. Baldwin et al. Infants parse dynamic action. Child Dev. (2001)

  • B.E. Bläsing. Segmentation of dance movement: effects of expertise, visual familiarity, motor experience and music. Front. Psychol. (2014)

  • R.J. Brand et al. Evidence for ‘motionese’: modifications in mothers' infant-directed action. Dev. Sci. (2002)

  • R. Church et al. Infant-directed speech: final syllable lengthening and rate of speech. Can. Acoust. (2005)

  • M. Craddock. ERP Visualization: Within-subject confidence intervals [Blog post] (2016, November 28)

  • M. de Haan et al. Recognition of the mother's face by six-month-old infants: a neurobehavioral study. Child Dev. (1997)

  • J.N. Fritsch et al. Detecting ‘When to Imitate’ in a social context with a human caregiver

  • A. Glushko et al. Neurophysiological correlates of musical and prosodic phrasing: shared processing mechanisms and effects of musical expertise. PloS One (2016)

  • P.E. Hemeren et al. Deriving motor primitives through action segmentation. Front. Psychol. (2010)

  • M. Hilton et al. Parallels in processing boundary cues in speech and action. Front. Psychol. (2019)

  • B.A. Hockey et al. Pre-boundary lengthening: universal or language-specific? The case of Hungarian (1998)