Cognition, Volume 210, May 2021, 104604

Children integrate speech and gesture across a wider temporal window than speech and action when learning a math concept

https://doi.org/10.1016/j.cognition.2021.104604

Abstract

It is well established that gesture facilitates learning, but understanding the best way to harness gesture and how gesture helps learners are still open questions. Here, we consider one of the properties that may make gesture a powerful teaching tool: its temporal alignment with spoken language. Previous work shows that the simultaneity of speech and gesture matters when children receive instruction from a teacher (Congdon et al., 2017). In Study 1, we ask whether simultaneity also matters when children themselves are the ones who produce speech and gesture strategies. Third-graders (N = 75) were taught to produce one strategy in speech and one strategy in gesture for correctly solving mathematical equivalence problems; they were told to produce these strategies either simultaneously (S + G) or sequentially (S➔G; G➔S) during a training session. Learning was assessed immediately after training, at a 24-h follow-up, and at a 4-week follow-up. Children showed evidence of learning and retention across all three conditions. Study 2 was conducted to explore whether it was the special relationship between speech and gesture that helped children learn. Third-graders (N = 87) were taught an action strategy instead of a gesture strategy; all other aspects of the design were the same. Children again learned across all three conditions. But only children who produced simultaneous speech and action retained what they had learned at the follow-up sessions. Results have implications for why gesture is beneficial to learners and, taken in relation to previous literature, reveal differences in the mechanisms by which doing versus seeing gesture facilitates learning.

Introduction

When people talk, they convey information not only through spoken language, but also through gesture––movements of the hands that express meaning. Decades of research show that gestures facilitate learning, whether students produce the gestures themselves as they learn a new concept (e.g., Cook, Mitchell, & Goldin-Meadow, 2008; Goldin-Meadow, Cook, & Mitchell, 2009; Novack, Congdon, Hemani-Lopez, & Goldin-Meadow, 2014), or observe the gestures that teachers produce as they explain a new concept (e.g., Congdon et al., 2017; Singer & Goldin-Meadow, 2005; Wakefield, Novack, Congdon, Franconeri, & Goldin-Meadow, 2018). This phenomenon has been well-studied in mathematical equivalence––an important pre-algebraic concept underlying children's understanding that the two sides of an equation must be equal (e.g., 8 + 4 + 3 = __ + 3; McNeil, 2014). Researchers have found that adding gesture to spoken instruction in mathematical equivalence leads to more immediate learning gains than presenting spoken instruction alone (e.g., Goldin-Meadow et al., 2009; Wakefield, Novack, et al., 2018), and that gesture aids in establishing long-lasting (e.g., Congdon et al., 2017; Cook et al., 2008) and flexible (e.g., Novack et al., 2014) understanding of this concept.

That gesture facilitates learning is well established, but understanding the best way to harness gesture and why gesture helps learners are still open questions. Gaining a better grasp on how and when to use gesture to promote learning is important not only for theoretical reasons, but also because it will allow researchers to make specific recommendations to educators about incorporating gesture into lesson plans. Researchers have consequently begun to focus on properties of gesture that have the potential to make it uniquely powerful as a teaching tool, and to systematically study how these properties impact learning outcomes.

One property of gesture that has the potential to make it a powerful learning tool is its temporal alignment with spoken language. Gesture is synchronized with the speech it accompanies (Kendon, 1980) and listeners seamlessly integrate gesture into the speech they are processing (McNeill, 1992). Even when gesture conveys different information from the speech it accompanies (cf. Goldin-Meadow, 2003), listeners integrate the two channels to form a single representation (Cassell, McNeill, & McCullough, 1999). Listening to speech, and watching the gestures that go along with it, thus permits learners to be simultaneously exposed to different, but complementary, ideas. In fact, there is evidence that children are more likely to learn from instruction when they are given two different strategies, one in gesture and one in speech, than when they are given the same strategy in both speech and gesture, or a single strategy in speech alone. Importantly, learning is greater when the two different strategies are presented simultaneously in speech and gesture, but not when they are presented sequentially in speech (Singer & Goldin-Meadow, 2005). This finding suggests that simultaneous presentation of speech and gesture may be crucial for learning––an idea that has been proposed more broadly in dual-coding theories of learning (e.g., Baddeley, 1999; Chandler & Sweller, 1991; Mayer, 2002, Mayer, 2005). According to these theories, learners benefit from input presented simultaneously in two modalities because our ability to process information from one input channel (e.g., hearing speech) has limits and adding a second input channel (e.g., seeing gesture) helps us go beyond those limits. Simultaneous gesture and speech input can thus be understood as a specific case of dual-coding.

Recent work by Congdon et al. (2017) experimentally explores the impact on children's learning of temporally aligning input from two modalities. Congdon et al. (2017) tested whether the temporal relation between speech and gesture produced by an instructor affects child learning outcomes by directly comparing simultaneous versus sequential presentation of speech and gesture strategies during math instruction. In the study, two strategies for solving mathematical equivalence problems were used: an equalizer strategy (the idea that the two sides of an equation need to sum to the same number) and an add-subtract strategy (the idea that problems can be solved by summing the addends on the left side of the equation and subtracting the addend on the right). Children were randomly assigned to one of three groups: (1) the teacher/experimenter produced the two strategies sequentially, equalizer followed by add-subtract, both in speech; (2) she produced the two strategies sequentially, equalizer in speech followed by add-subtract in gesture; (3) she produced the two strategies simultaneously, equalizer in speech along with add-subtract in gesture. Congdon and colleagues found that simultaneity mattered––children who saw simultaneous speech and gesture during the lesson retained what they had learned better than children in the other two conditions.

The present study builds on this work by addressing a question of practical and theoretical importance––whether temporal synchrony between speech and gesture is also crucial for learning when speech and gesture are produced by the student, rather than the teacher; in other words, when the learner does gesture rather than sees someone else do it. From a practical perspective, we know that teachers and students use gesture in the classroom. But we do not know whether the recommendations we make to teachers about how they should use gesture also apply to how students should use gesture. To reap learning benefits, must children produce strategies in speech and gesture simultaneously, or can they produce the strategies sequentially?

From a theoretical perspective, we have begun to understand the mechanisms by which gesture shapes learning, but do not know how general these mechanisms are. The temporal synchrony between speech and gesture during instruction may be necessary for learning, and it may therefore be the source of gesture's power as a teaching tool, whether the gestures are produced or observed. Alternatively, the temporal synchrony between speech and gesture during instruction may matter only when gesture is observed; when it is produced by the learner, it may not be necessary to temporally align the two modalities. There is evidence that learning outcomes differ when children produce gestures themselves versus observing an experimenter produce gestures. For example, Goldin-Meadow et al. (2012) found that children learned more on a mental rotation task when they were taught to produce meaningful gestures about translation and rotation than when they observed an experimenter producing these same gestures. In a different paradigm, Wakefield, Hall, et al. (2018) demonstrated that these learning differences extend across time. Children either produced gestures for an action, or observed an experimenter produce the same gestures, during a word learning task. They were better at remembering the newly taught word for the action 24 h later if they themselves had produced the gestures than if they had watched the experimenter produce the gestures. Similarly, in a recent meta-analysis, Dargue, Sweller, and Jones (2019) found that children showed greater learning gains when producing gesture than when observing gesture. These findings suggest that the mechanisms responsible for learning from the gestures one produces oneself, and learning from the gestures others produce, might differ.

Consistent with this view, there are hints in the literature that children can learn when they produce gesture without speech, suggesting that the temporal alignment between speech and gesture may not be that important when learners themselves produce the gesture. Cook et al. (2008) modeled an equalizer strategy for solving mathematical equivalence problems for children to produce in speech alone, gesture alone, or speech and gesture simultaneously during a math lesson. Children in all three groups improved after the lesson, but children who produced the strategy in speech alone did not retain what they learned over a 4-week delay. In contrast, children who produced the strategy in gesture, either on its own or with speech, performed well on retention measures. Children seem to be able to benefit simply from doing gesture, with or without speech. Brooks and Goldin-Meadow (2016) modeled an equalizer gesture or a control gesture (which was not interpretable in the context of a mathematical equivalence problem), neither of which was accompanied by speech, for children to produce during a math lesson. Children who produced the equalizer strategy in gesture were more likely to profit from the math lesson than children who produced the control gesture. Gesture can be a powerful force when it is in the hands of the learner, even when it is produced without speech.

In Study 1, we ask whether Congdon et al.'s (2017) findings for seeing gesture extend to doing gesture––we ask whether gesture and speech need to be produced simultaneously in order for learning to occur when learners themselves produce the gestures. We gave children who were unable to solve mathematical equivalence problems on a pretest models for problem-solving strategies in speech and gesture that they were then asked to produce during a math lesson. We then measured gains in knowledge immediately after the lesson, at a one-day follow-up session, and at a four-week follow-up session––the same time points at which Congdon and colleagues measured learning gains after learners saw the experimenter produce gestures during instruction. Children were randomly assigned to one of three groups and taught a lesson on how to solve mathematical equivalence problems (e.g., 3 + 6 + 5 = __ + 5). Children in all three groups were taught to produce an equalizer strategy (the idea that the two sides of the equation need to be equal) in speech, and a grouping strategy (the idea that the two unique numbers on the left side of the equation can be grouped and summed to arrive at the number that goes in the blank on the right side of the equation) in gesture. One group was taught to produce the two strategies simultaneously (S + G). Two groups were taught to produce the two strategies sequentially, one in which gesture preceded speech (G➔S) and one in which speech preceded gesture (S➔G). We hypothesized that if gesture facilitates learning through the same mechanisms when it is produced by learners as when it is observed by learners, children should display the best long-term learning after producing speech and gesture strategies simultaneously during the math lesson (S + G). 
Although we had no a priori hypothesis about whether the sequence in which speech and gesture appeared would affect learning, it is possible that one modality serves as better contextual support for the other (i.e., that gesture serves as better contextual support for speech than speech serves for gesture, or vice versa). We therefore varied the order of speech and gesture, and tested whether learning differed in the G➔S and S➔G conditions. A final possibility is that the timing between speech and gesture does not matter when learners produce the strategies themselves. If so, all three conditions (S + G, G➔S, S➔G) ought to be equally good for learners.

Section snippets

Participants

Data from 75 third-grade students (M = 9.06 years, SD = 0.52; 40 females) were analyzed in Study 1. The study focused on children of this age because third-graders typically do not understand mathematical equivalence and fail to solve problems of this format (e.g., McNeil, 2014). In order to ensure that all participants had the same starting knowledge, children were excluded from the study if they solved any of the pretest problems correctly. An additional 70 children were tested, but excluded

Results

All analyses were conducted in RStudio (version 1.2.1335), running R version 3.6.1 “Action of the Toes” (R Core Team, 2019). Analyses relied on the lme4 package, which allows for mixed effects modeling (Bates, Mächler, Bolker, & Walker, 2015). When running mixed effects models in lme4, we used dummy coding, the default coding option in this package. Appropriate reference levels for factors were assigned before each model was run: for testing day, the immediate posttest was
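The modeling approach described above can be sketched as follows. This is a minimal illustration, not the authors' actual analysis script: the variable names (Correct, Condition, TestingDay, SubjectID), the toy data, and the exact model formula are all assumptions introduced for the example.

```r
# Sketch of a logistic mixed effects analysis via lme4, assuming
# hypothetical variable names; the toy data here stand in for the
# real dataset and are illustrative only.
library(lme4)

set.seed(1)
d <- expand.grid(SubjectID  = factor(1:30),
                 TestingDay = factor(c("Immediate", "Day2", "Week4")),
                 Item       = 1:6)
d$Condition <- factor(c("S+G", "S->G", "G->S"))[as.integer(d$SubjectID) %% 3 + 1]
d$Correct   <- rbinom(nrow(d), 1, 0.6)

# Dummy (treatment) coding is the default for unordered factors in R/lme4;
# set the reference levels before fitting, as described in the text
# (immediate posttest as the reference for testing day).
d$Condition  <- relevel(d$Condition,  ref = "S+G")
d$TestingDay <- relevel(d$TestingDay, ref = "Immediate")

# Accuracy modeled by condition, testing day, and their interaction,
# with a random intercept for each child.
m <- glmer(Correct ~ Condition * TestingDay + (1 | SubjectID),
           data = d, family = binomial)
summary(m)
```

With three conditions and three testing days under dummy coding, the fixed effects comprise an intercept, two condition contrasts, two testing-day contrasts, and four interaction terms, each interpreted relative to the chosen reference levels.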

Study 2

In Study 1, we found that, when gesture is in the hands of the learner, it facilitates long-lasting learning, whether it is produced simultaneously or sequentially with speech. This finding underscores the need to consider whether gesture is produced by students or teachers when we recommend how gesture should be used as a teaching tool. Congdon et al.'s (2017) findings suggest that teachers should use gesture and spoken instruction simultaneously; our findings suggest that this recommendation

Results

As in Study 1, we considered how children performed at immediate posttest before addressing our main question (how retention was affected by type of training). Table 2 presents the average proportion correct for all problem types and testing days by condition. Children performed best on Form A problems, the problem type used during the training session, and performed less well on Form B and C problems.

To test for statistically significant differences, we used the same approach as in Study 1. We

Comparing across Studies 1 and 2

Using the same testing procedures in Studies 1 and 2 allows for comparison across action and gesture training. Although students were not randomly assigned to each study, all participants were third graders who were not able to solve any of the problems correctly before instruction. Study 1 and Study 2 were also conducted by the same three experimenters, who were blind to the hypotheses of the study. The students in the studies came from three different public schools within the same city, and

Discussion

Previous work suggests that gesture is a powerful teaching tool, in part, because it can express information simultaneously with spoken instruction and thus promote integration across the two channels. Supporting this idea, Congdon et al. (2017) found that children retained what they had learned from mathematical equivalence instruction significantly better if their teacher produced gesture simultaneously with speech than if she produced gesture sequentially with speech. However, what was not

Conclusion

Previous work has shown that the temporal alignment between speech and gesture matters for learning when the learner observes the teacher produce speech and gesture––learning and retention are better when learners see gesture produced simultaneously with speech than when they see it produced sequentially with speech (Congdon et al., 2017). We have found here that temporal alignment does not matter for learning and retention when speech and gesture are produced by the learner. Importantly, this

Funding

Funding for this study was provided by the National Science Foundation (EHR 1561405) to Susan Goldin-Meadow (PI) and Elizabeth M. Wakefield (co-PI). We also thank Casey Hall for her help with data coding.

References (32)

  • Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and Instruction.

  • Church, R. B., et al. (2014). Temporal synchrony between speech, action and gesture during language production. Language, Cognition and Neuroscience.

  • Clark, H. H., et al. (1990). Quotations as demonstrations. Language.

  • Dargue, N., Sweller, N., & Jones, P. E. (2019). When our hands help us understand: A meta-analysis into the effects of gesture on comprehension. Psychological Bulletin.

  • Goldin-Meadow, S. (2003). Hearing gesture: How our hands help us think.

  • Goldin-Meadow, S., Cook, S. W., & Mitchell, Z. A. (2009). Gesturing gives children new ideas about math. Psychological Science.
1. Co-first author.