The temporal dynamics of infants' joint attention: Effects of others' gaze cues and manual actions
Introduction
Social attention to others' eyes, faces, and actions is foundational to how we communicate, learn about the social and physical world, regulate emotions, and develop attachments with others. Beginning at birth, infants attend preferentially to faces and are most sensitive to the presence of eyes in a face (Acerra, Burnod, & de Schonen, 2002; Batki, Baron-Cohen, Wheelwright, Connellan, & Ahluwalia, 2000; Johnson & Morton, 1991). In addition, newborn infants prefer to orient to faces displaying direct gaze (Farroni, Csibra, Simion, & Johnson, 2002) and show a rudimentary form of gaze following (Farroni, Massaccesi, Pividori, & Johnson, 2004). Some evidence suggests that newborns recognize their mother's face (e.g., Bushnell, 2001), and these recognition abilities continue to develop over the first few months (Nelson, 2001). Beginning around 10 weeks of age, infants fixate more consistently on the internal features of a face than on the external features and contours, especially when the face is speaking (Haith, Bergman, & Moore, 1977; Hunnius & Geuze, 2004). By three months, infants begin to differentiate faces based on the social categories of gender and race (Kelly et al., 2005; Quinn, Yahr, Kuhn, Slater, & Pascalis, 2002).
Face perception continues to improve over the next few months, as infants often engage in dyadic interactions with their caregivers, ensuring that faces are a prominent part of their visual experience (Lock & Zukow-Goldring, 2010; Lockman, 2000). Once they can sit without support and coordinate their reaches toward objects, infants' reliance on interactions with other people for stimulation begins to decline. By around six months of age, infants are much more likely to divide their attention between exploring objects with their eyes and hands and interacting with social partners (Lock & Zukow-Goldring, 2010). For the next few months they typically distribute their attention to either objects or social partners, but they still must learn to share their attention to a common referent with someone else. It is not until 9 to 12 months of age (at least according to most social-cognitive theorists) that infants attribute intentional states to social partners, enabling them to engage in triadic interactions (Tomasello, 2008; Woodward, 2009), such as participating with others in joint attention to objects and establishing common ground (Bakeman & Adamson, 1984; Carpenter, Nagell, & Tomasello, 1998), pointing to objects communicatively (Carpenter et al., 1998), and expecting social partners to express interest in shared referents (Liszkowski, Carpenter, & Tomasello, 2007).
In order for infants to develop these skills, they must first learn to coordinate their attention to their social partner with their attention to objects (Bertenthal, Boyer, & Harding, 2014). Although it is well established that this developmental transition occurs, little is known about how a preference for faces gives way to a more distributed view of the social world that includes not only faces, but bodies and actions, as well as objects. In general, attention is the front end of encoding and interpreting all stimulus information encountered in the environment, and thus it is essential not only for learning to recognize and discriminate faces, but others' actions as well. How do infants decide where to look from moment to moment when confronted with not only a dyadic partner but also an assortment of objects, other people, and events in their optic arrays? Early on, infants' orienting to stimuli in the environment is primarily under exogenous, stimulus-driven control, but over time they also begin to develop endogenous control over their attention (Johnson, 2011; Mundy & Jarrold, 2010). As such, they begin modulating their attention in response to the actions of their social partner as well as the context (Bertenthal & Boyer, 2015). Indeed, this is exactly what is necessary for infants to follow the gaze direction of a social partner during shared attention. If infants could not modulate their attention, then they would simply continue to be guided by their bias for faces, but the development of joint attention suggests otherwise.
Although there has been considerable research investigating the social cognitive prerequisites for joint attention, such as shared intentions or common ground (Tomasello, 2008), much less is known about how and when infants begin to dynamically coordinate their social attention among faces, actions, and objects. One reason for the sparseness of relevant findings is that most studies obviate the need for infants to choose between different stimulus cues. Infants are typically presented with a specific sequence of events, such as an actor eliciting an infant's attention, then looking or pointing in a specific direction, followed by an object appearing either in that direction or the opposite direction; infants merely have to attend to the stimuli in the order they appear and need not choose when and what to look at (e.g., Bertenthal et al., 2014; Gredebäck, Fikke, & Melinder, 2010; Senju & Csibra, 2008). In more naturalistic situations, such as an infant interacting with a caregiver in a cluttered room among a set of objects over a more extended period of time, the caregiver might alternate between gazing at the child and the objects and jointly playing with those objects or showing them to the child. The question then becomes, how much are infants' looking behaviors guided by attention to the face or by attention to the manual actions of the caregiver, the orientation of her face, her body posture, or changes in her object-directed actions? This is a critical question because infants' systematic selection of social information in triadic interactions may not only precede but catalyze their appreciation of shared intentions or common ground. In other words, these preferences establish new opportunities for social interaction and social learning, which might very well contribute to their social-cognitive development. It is for this reason that we sought to study how infants distribute their attention during social interactions.
Recent advances in infant eye-tracking research offer important opportunities for systematically investigating how infants allocate their attention to social and non-social stimuli. Most studies, however, still rely on presenting highly scripted and repetitive actions to infants in experimental paradigms involving a live, digital image or movie of a social partner looking or reaching toward an object following an ostensive cue, such as eye contact with the viewer (e.g., Daum, Ulber, & Gredebäck, 2013; Senju & Csibra, 2008; Woodward, 1998). During the past decade, Frank and colleagues (Frank, Vul, & Johnson, 2009; Frank, Vul, & Saxe, 2012) made some important progress in studying infants' and toddlers' social attention to more naturalistic visual scenes. For instance, Frank et al. (2012) measured the visual fixations of infants and toddlers between 3 and 30 months of age while viewing short videos of objects, faces, children playing with toys, and complex social scenes involving more than one person. The results revealed that the youngest infants looked primarily at faces, and eyes in particular, but older infants and toddlers distributed their gaze more flexibly and looked more at the mouth and also significantly more at the hands, especially when the hands were engaged in actions on objects. One important question that could not be addressed by these studies is whether children's attention is directed differently to people observed from a first-person as opposed to a third-person perspective.
A more recent study by Elsabbagh et al. (2014) also studied infants' relative distribution of fixations to the eyes and mouth when viewing a social partner (observed from a first-person perspective) with eyes, mouth or hands moving or expressing multiple communicative signals (e.g., “peek-a-boo”). Consistent with previous studies, infants between 7 and 15 months of age looked at the eyes more than the mouth, but this difference was contextually modulated, such that when only the mouth moved infants looked more at the mouth than when only the eyes moved. Taken together, these last few studies suggest that by sometime during the latter half of the first year infants' social attention is controlled by both stimulus-driven factors, such as sensory (e.g., contrast, color, orientation, and motion) and social salience (e.g., faces), as well as more endogenous or goal-directed factors that can exert control of looking behavior.
The objective of the current study was to move beyond these generalizations in order to better understand how infants dynamically select their focus of attention while observing people who appear to be interacting with them. This dynamic selection of where to look is a prerequisite for joint attention. During direct gaze there is an opportunity for eye contact and communication with the social partner, whereas during averted gaze there is an opportunity for joint attention toward another person or object (Farroni, Mansfield, Lai, & Johnson, 2003; Senju & Csibra, 2008; Senju, Csibra, & Johnson, 2008). Previous eye tracking studies were restricted to reporting where infants directed their attention based on first-order stimulus information, such as faces or objects in the scene (e.g., Jones & Klin, 2013). As such, these studies ignored how contextual and social cues, such as gaze direction or actions, might orient infants to look toward a specific location. These second-order cues result in a more complex and probabilistic process, because the observer decides where to look not only as a function of the region of interest (e.g., faces, objects) but also in response to other actions as well as knowledge of the preceding events. For example, the likelihood of looking at someone's face during a conversation is much higher if that individual's gaze is oriented directly toward you as opposed to looking toward another object (Kleinke, 1986; Senju & Hasegawa, 2005). If, however, the social partner is also waving her hands or manipulating an object while looking toward you, the likelihood of looking at the face and establishing eye contact with the social partner decreases. In typical social interactions, the cues for where to look will often compete, and this is especially true for young infants outside of the lab. This is the reason that we sought to study how infants guide their visual attention during more naturalistic social situations.
We measured infants' eye gaze to dynamic social scenes. Unlike the studies conducted by Frank and colleagues, the stimuli were not movies of people or cartoon characters shown from a third-person perspective such that infants were simply watching a movie. Instead, our stimuli were created to show different actors socially engaged with an observer viewed from a first-person perspective. Although the stimuli were videos, they were designed to simulate naturalistic situations that could occur between a social partner and an infant. As such, each of 16 videos presented one of five female actors talking and demonstrating a sequence of simple actions, such as putting a shirt on a stuffed animal. Since our primary goal was to conduct a detailed analysis of the changing focus of attention during joint attention, it was especially important to include both people and objects. Contrary to conventional wisdom, a few recent studies suggest that infants do not always look at the social partner's eyes or face during joint attention; instead they focus primarily on sharing attention to the same object-directed actions (Deák, Krasno, Jasso, & Triesch, 2018; Deák, Krasno, Triesch, Lewis, & Sepeta, 2014; Franchak, Kretch, Soska, & Adolph, 2011; Yu & Smith, 2013). Thus, it was especially important for us to include not only people and their gestures, but object-directed actions as well.
Three age groups were tested: 8- and 12-month-old infants, and adults. The two infant groups were selected to straddle the age at which joint attention develops, and adults were included to enable a comparison of the infants' performance with more mature visual scanning behavior. Our goal was to assess the degree to which developmental changes in shifting attention to faces vs. objects were a function of the direction of head and eye gaze as well as infant-directed and object-directed actions.
We hypothesized that 12-month-old infants and adults would systematically sustain or shift attention as a function of the actors' gaze direction and actions, whereas 8-month-old infants' attentional focus would be less predictable from the actors' social cues. This prediction for 8-month-old infants was predicated on a number of specific findings: Most of the current evidence suggests that infants do not respond to gaze cues as referential prior to 9 months of age, and thus they are less likely to systematically respond to gaze direction during observation of the actions of a social partner (e.g., Johnson, Ok, & Luo, 2007; Senju et al., 2008; Woodward, 2003). There is, however, a caveat to this finding. Infants as young as 3 to 4 months of age will shift their attention in the direction of averted gaze if the target consists of moving hands and objects (Amano, Kezuka, & Yamamoto, 2004; Deák et al., 2018). Accordingly, we expected 8-month-old infants to respond to averted gaze more like 12-month-old infants when this gaze was coupled with object-directed actions. Less clear was how participants in all three age groups would respond to social cues that were incongruent (e.g., direct gaze toward the viewer while performing an object-directed action). As we will discuss, object-directed actions were often the best predictor of when infants would share attention with the actors in the videos.
Participants
Twenty-two eight-month-old infants (M = 243.0 days, SD = 8.7 days; 11 females, 11 males), 20 twelve-month-old infants (M = 371.6 days, SD = 8.7 days; 7 females, 13 males), and 20 adults (10 females, 10 males) comprised the sample for this study. Two additional eight-month-old infants were tested but were excluded due to fussiness or our inability to calibrate the eye-tracking system and record valid data. Parents provided consent for their child's participation and all infants received a
Results
The main goal of this study was to test whether infants and adults modulated their attention to faces and objects as a function of gaze direction and action type. In order to address this question, it was necessary to first determine how visual attention should be measured. Although most developmental studies measure visual attention in terms of total duration of looking, we opted to measure attention exclusively in terms of visual fixations. Our eyes scan the visual world via saccadic
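The fixation-based measure described above can be made concrete with a short sketch. The snippet below is a minimal, illustrative implementation of dispersion-threshold (I-DT) fixation detection, one standard way to parse raw gaze samples into discrete fixations; the function names and threshold values are assumptions for illustration only and do not describe this study's actual analysis pipeline.

```python
# Minimal sketch of dispersion-threshold (I-DT) fixation detection.
# A fixation is a run of consecutive gaze samples whose spatial
# dispersion stays below a threshold for a minimum number of samples.
# Thresholds here are illustrative, not the study's parameters.

def _dispersion(window):
    """Dispersion of a window of (x, y) samples: x-range plus y-range."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def detect_fixations(samples, max_dispersion=1.0, min_samples=5):
    """samples: list of (x, y) gaze positions at a fixed sampling rate.
    Returns a list of (start_index, end_index, centroid) fixations."""
    fixations = []
    start = 0
    while start + min_samples <= len(samples):
        end = start + min_samples
        if _dispersion(samples[start:end]) <= max_dispersion:
            # Grow the window while dispersion stays under threshold.
            while end < len(samples) and _dispersion(samples[start:end + 1]) <= max_dispersion:
                end += 1
            xs = [p[0] for p in samples[start:end]]
            ys = [p[1] for p in samples[start:end]]
            centroid = (sum(xs) / len(xs), sum(ys) / len(ys))
            fixations.append((start, end - 1, centroid))
            start = end
        else:
            # No fixation starting here; slide the window forward.
            start += 1
    return fixations
```

With, say, a 60 Hz tracker, a `min_samples` of 5 corresponds to a minimum fixation duration of roughly 83 ms; in practice both thresholds are tuned to the tracker's sampling rate and noise level.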
Discussion
A prerequisite for joint attention is that both infants and adults coordinate their focus of attention with the gaze direction and actions of their social partner. Most previous eye tracking studies presented faces in isolation, which obviated the need for joint attention. By contrast, this study presented videos of actors appearing to interact with observers so that we could precisely measure how direction of gaze and actions affect the spatiotemporal patterning of infants' gaze during more
Conclusions
This study adopted a hybrid approach to studying joint attention combining high spatial resolution eye tracking with more naturalistic social stimuli. Our results reveal that joint attention is not a monolithic process nor does it develop all at once. Indeed, it is not even possible to suggest that infants respond to different gaze cues in the same way. Our results suggest important processing differences between direct and averted gaze in triggering joint attention. Moreover, object-directed
Author contributions
Ty W. Boyer: Conceptualization, Methodology, Investigation, Software, Data Curation, Writing – Reviewing and Editing.
Samuel M. Harding: Formal analysis, Visualization, Software, Data Curation, Writing – Reviewing and Editing.
Bennett I. Bertenthal: Supervision, Conceptualization, Writing – Original draft preparation
Acknowledgements
Portions of these data were previously presented at the biennial meetings of the International Society for Infant Studies, New Orleans, LA, May 2016, and the meetings of the Psychonomic Society, New Orleans, LA, November 2018. This research was supported in part by funds from NIH Grant (U54 RR025215) to the third author. The authors wish to thank the parents and children who participated, and Jimeisha Brooks, Sloan Fulton, Jessica Luke, Keeley Newsom, and Ian Nolan for assistance in coding the
References (67)
- Amano, Kezuka, & Yamamoto (2004). Infant shifting attention from an adult's face to an adult's hand: A precursor of joint attention. Infant Behavior & Development.
- Batki, Baron-Cohen, Wheelwright, Connellan, & Ahluwalia (2000). Is there an innate gaze module? Evidence from human neonates. Infant Behavior & Development.
- Farroni, Mansfield, Lai, & Johnson (2003). Infants perceiving and acting on the eyes: Tests of an evolutionary hypothesis. Journal of Experimental Child Psychology.
- From faces to hands: Changing visual input in the first two years. Cognition (2016).
- Frank, Vul, & Johnson (2009). Development of infants' attention to faces during the first year. Cognition.
- Speakers' eye gaze disambiguates referring expressions early during face-to-face conversation. Journal of Memory and Language (2007).
- Gaze allocation in face-to-face communication is affected primarily by task structure and social context, not stimulus-driven factors. Cognition (2019).
- Mundy & Jarrold (2010). Infant joint attention, neural networks and social cognition. Neural Networks.
- Senju & Csibra (2008). Gaze following in human infants depends on communicative signals. Current Biology.
- Senju, Csibra, & Johnson (2008). Understanding the referential nature of looking: Infants' preference for object-directed gaze. Cognition.
- A new look at joint attention and common knowledge. Cognition.
- Woodward (1998). Infants selectively encode the goal object of an actor's reach. Cognition.
- Acerra, Burnod, & de Schonen (2002). Modelling aspects of face processing in early infancy. Developmental Science.
- An eye tracking investigation of developmental change in bottom-up attention orienting to faces in cluttered natural scenes. PLoS One.
- Bakeman & Adamson (1984). Coordinating attention to people and objects in mother-infant and peer-infant interaction. Child Development.
- GazeAlyze: A MATLAB toolbox for the analysis of eye movement data. Behavior Research Methods.
- Bertenthal & Boyer (2015). Development of social attention in human infants.
- Bertenthal, Boyer, & Harding (2014). When do infants begin to follow a point? Developmental Psychology.
- Gaze selection in complex social scenes. Visual Cognition.
- Bushnell (2001). Mother's face recognition in newborn infants: Learning and memory. Infant and Child Development.
- What minds have in common is space: Spatial mechanisms serving joint visual attention in infancy. British Journal of Developmental Psychology.
- Carpenter, Nagell, & Tomasello (1998). Social cognition, joint attention, and communicative competence from 9 to 15 months of age. Monographs of the Society for Research in Child Development.
- Attention-getting and attention-holding processes of infant visual preferences. Child Development.
- Recognizing communicative intentions in infancy. Mind & Language.
- Daum, Ulber, & Gredebäck (2013). The development of pointing perception in infancy: Effects of communicative signals on covert shifts of attention. Developmental Psychology.
- Deák, Krasno, Jasso, & Triesch (2018). What leads to shared attention? Maternal cues and infant responses during object play. Infancy.
- Deák, Krasno, Triesch, Lewis, & Sepeta (2014). Watch the hands: Infants can learn to follow gaze by seeing adults manipulate objects. Developmental Science.
- Elsabbagh et al. (2014). What you see is what you get: Contextual modulation of face scanning in typical and atypical development. Social Cognitive and Affective Neuroscience.
- Farroni, Csibra, Simion, & Johnson (2002). Eye contact detection in humans from birth. Proceedings of the National Academy of Sciences of the United States of America.
- Farroni, Massaccesi, Pividori, & Johnson (2004). Gaze following in newborns. Infancy.
- Franchak, Kretch, Soska, & Adolph (2011). Head-mounted eye tracking: A new method to describe infant looking. Child Development.
- Frank, Vul, & Saxe (2012). Measuring the development of social attention using free-viewing. Infancy.
- Gredebäck, Fikke, & Melinder (2010). The development of joint visual attention: A longitudinal study of gaze following during interactions with mothers and strangers. Developmental Science.