Introduction

Collaborative learning is a promising instructional technique for learning to solve complex problems (Hesse et al. 2015). However, research shows that its benefits are not always consistent (Kester and Paas 2005; Slavin 2014). Discrepancies may be due to a lack of knowledge about the many different interacting variables involved in inter-individual activities (Hogg and Gaffney 2018). To reduce this gap, this paper first discusses the advantage of preparing groups to collaborate effectively and shows some existing knowledge gaps in the research. Second, cognitive load theory is used to suggest the advantages of preparing groups to collaborate because this would optimize the collaborative cognitive load, taking into account the effect of the distribution of task information among group members. Third, these theoretical considerations are followed by a report on an experiment that investigated the effect of prior collaborative experience and information distribution on collaborative learning and its outcomes (i.e., in short-term retention and delayed retention tests) (Kirschner et al. 2018; Sweller et al. 2011).

Collaborative learning

Collaborative learning has become increasingly important in schools and organizations. It is the process by which learners interact in small groups to learn (Slavin 2014). This instructional technique has been broadly studied from different disciplines and theoretical perspectives (Hmelo-Silver et al. 2013). Consequently, there are many strategies for designing learning environments based on group work such as structured academic controversy (Johnson and Johnson 1988), jigsaw (Aronson and Patnoe 2011), reciprocal teaching (Palincsar and Brown 1985), and division of student teams based on achievement (Slavin 1978). These techniques have been categorized as cooperative when group interactions are highly structured to achieve specific learning goals, and each learner is responsible for a part of the task. Cooperative learning strategies are mostly conceived from psychological or sociological accounts. This approach is often strictly governed by rules to aid group members in their interaction and, as such, is more directive than collaborative learning and is usually strictly controlled by the teacher (Panitz 1999). In contrast, collaborative strategies derive mostly from philosophical and political accounts that suppose that knowledge is a social construction. Here, group members are expected to share authority and responsibility for group actions (Panitz 1999). These perspectives advocate that learners work in small groups and knowledge communities to share, dialogue, and create meaning around their knowledge and experiences (Oxford 1997). In this research, although the term collaborative learning is used, we do not distinguish between cooperative and collaborative learning because there are more commonalities than differences between them in terms of fostering deep learning. For example, learning happens in an active mode, the teacher plays the role of facilitator, teachers and learners share knowledge, students work in small-group activities, students must take responsibility for learning, and learners should develop team skills (Kirschner 2001).

There is considerable evidence that shows the benefits and limitations of collaborative learning. For example, the meta-analysis conducted by Johnson et al. (1981) indicated that collaboration resulted in significantly higher test performance than interpersonal competition and individualistic efforts. It also showed that collaboration with intergroup competition is better than interpersonal competition and individualistic efforts. Furthermore, it was found that task productivity (i.e., group product) and task interdependence were associated with better results, whereas for rote decoding and correcting tasks, collaboration was less effective. In another meta-analysis, Qin et al. (1995) concluded that group members outperformed individuals competing on different problem-solving tasks. The meta-analysis by Pai et al. (2015) found that small-group learning can promote transfer; however, the authors acknowledge that additional research is needed to clarify how the structure and complexity of the task affect transfer.

In contrast, Thanh et al. (2008) found that groups sometimes did not work as expected if their learners had a strong culture of competition and dedicated much time to individualistic learning. They concluded that collaborative group work would be difficult to implement in these social contexts. Another meta-analysis (Kyndt et al. 2013) concurs with this conclusion as it found that individualistic cultures were often less likely to obtain high effects under collaborative conditions. Other authors have found negative factors at the individual and group level that hinder collaborative learning such as social loafing, social pressure, group conformity, the free-rider effect, and the sucker effect (see Kreijns et al. 2003; Rajaram and Pereira-Pasarin 2010). To untangle the inconclusive results about the advantages of collaborative learning, some researchers have suggested preparing groups for learning collaboratively (Cortez et al. 2009; Jurkowski and Hänzea 2016; Van den Bossche et al. 2011).

Preparing groups for collaboration

Grouping learners to learn from each other does not mean that they will work appropriately or that they will learn better (Lou et al. 1996). There are data that support the assumption that preparing learners to work together may be a way to improve collaborative learning results (Baines et al. 2007; Bischoff et al. 2012; Buchs et al. 2015; Gillies and Ashman 1996; Jurkowski and Hänze 2015). For example, Prichard et al. (2006) examined the benefits of preparing learners on how to work in groups with different cohorts. They found that a cohort that received instructions on how to collaborate outperformed a cohort that was not prepared, and that the benefits of preparing for collaboration were lost when the group members split up into new groups. Buchs et al. (2015) also prepared learners by providing them with instruction on why and how to collaborate. They found that learning in dyads after 10 minutes of instruction on working together resulted in better learning results compared to learning individually or collaboratively without such instruction. Similarly, Jurkowski and Hänze (2015) used a 100-min session for training students about transactive communication to enhance group communication and knowledge acquisition during collaborative learning. Their results showed that trained groups outperformed untrained groups and displayed more transactive communication.

Other investigations show that learners with prior group preparation can allocate effective communication patterns to efficiently complete a task (Jurkowski and Hänzea 2016), exchange elaborated explanations and constructive activities (Webb et al. 1995), and effectively distribute high task demands amongst themselves and monitor their contributions (Fransen et al. 2011). Once groups have acquired task and team schemas (i.e., a shared mental model, Van den Bossche et al. 2011), they may better focus their interactions on learning tasks and obtain better learning. Conversely, a group without such prior experience may perform interactions that may be irrelevant to the task. These data suggest that groups may obtain higher test scores and be more efficient when receiving guidance on how to collaborate on relevant tasks (Jurkowski and Hänze 2015; Kirschner and Erkens 2013; Stevens et al. 1991).

Among the limitations of the perspective that advocates preparing groups for collaboration is the lack of attention to the factors that may affect the quality of the interactions and whether effects are long-lasting (e.g., on delayed retention tests after 1 week) (Soderstrom and Bjork 2015). Inter-individual processes may result in different outcomes depending on the test timing, characteristics of the group members (e.g., learners with prior collaborative experience) and the demands of the task. Cognitive load theory may help to understand how task complexity affects the performance and mental effort of collaborative learning.

Cognitive load theory and collaborative learning

Cognitive load theory is an instructional theory based on the human cognitive architecture that underlies inter-individual activities (Sweller et al. 2011). According to the theory, acquiring new domain-specific knowledge depends on working memory limitations that may not allow processing of more than about two elements at once (Cowan 2010). If the tasks require processing many highly interacting elements in a limited amount of time, learners will need to execute many cognitive operations, which increases cognitive load. Cognitive load refers to the working memory load intensity when performing cognitive activities to achieve a specific learning goal (Kalyuga and Singh 2016). This load is intrinsic if it refers to processing essential information of learning tasks, or extraneous if it is caused by instructional procedures. Germane cognitive load refers to working memory resources available to deal with intrinsic cognitive load (Sweller 2010). Optimal instruction for novices should reduce extraneous load and maintain intrinsic load without exceeding working memory capacity. If intrinsic cognitive load is low, extraneous cognitive load differences may have less effect because working memory limits may not have been exceeded. Once a learner has stored task information elements in long-term memory, they can be retrieved as an encapsulated element, freeing up working memory resources for processing new information (Sweller 2010).

Cognitive load theory findings mostly apply to individual learning conditions. However, collaborative learning is gaining attention from cognitive load researchers (Kester and Paas 2005). In group learning settings, one factor that may influence cognitive load, in addition to the interacting information elements of the task, is transactional activities consisting of communication and coordination activities among group members that are specific to collaborative learning. Working together is necessary when performing a group task and, as such, transactional activities play a critical role in determining the advantages and the limitations of collaborative learning (Kirschner et al. 2018).

Collaborative learning seems to work better when learning tasks are cognitively demanding. Studies conducted by Kirschner et al. (2011a, b) suggest that group learning is more efficient when tasks are highly complex. Kirschner et al.’s studies found that the task should be complex enough to justify investing working memory resources in transactional activities. However, if tasks have a low level of complexity, transactional activities are unnecessary and even detrimental compared to individual learning. These investigations suggest that distributing information elements of high-complexity tasks amongst learners may increase test scores and cognitive efficiency because information elements are processed by more working memories (the collective working memory effect, Kirschner et al. 2011a). Further, cognitive load imposed by transactional activities may be lower compared to the load associated with processing all information elements by one learner.

Other studies conducted by Retnowati et al. (2010, 2016) suggest that collaboration may not improve learning in high-complexity tasks compared with individual learning, depending on the instructional procedure being followed. They investigated the effect of conventional problems and worked-out examples on individual and collaborative learning and found that in some high-complexity tasks, individuals performed better than groups. They also found that collaborative learning was more beneficial than individual learning in solving problems but not in studying worked examples.

Optimizing transactional activities

In tasks that should be performed individually (i.e., that do not require collaboration), transactional activities impose an extraneous cognitive load because communication and coordination activities are not essential components. If the task is collaborative in nature, transactional activities are a type of intrinsic cognitive load. In either case, collaborative load should be optimized through instructional procedures to achieve the learning goals (Kirschner et al. 2018).

Prior collaborative experience as generalized domain knowledge

Literature about preparing learners for collaboration suggests that learning in groups may be more beneficial when the members of the group receive explicit guidance on how to work together (see section Preparing Groups for Collaboration). Providing collaborative experiences with high-complexity tasks may help learners acquire shared mental models of joint work (Van den Bossche et al. 2011) that can guide their transactional activities during collaborative learning. This does not mean that collaborative learning is a kind of general knowledge that can be applied to any domain of knowledge indiscriminately. This general knowledge perspective fails to take into account that the characteristics of the multiple types of learning tasks can result in different forms of joint work and that there are many ways to learn collaboratively. This premise suggests that it is better to prepare learners to collaborate according to particular characteristics of a task or domain. Task- or domain-based collaborative experience may help learners to generalize those skills that are unique to that learning environment (Bischoff et al. 2012).

Prior collaborative experience is a factor that has not yet been explored using cognitive load theory. However, the emerging construct of generalized domain knowledge may imply this experience. While domain-specific knowledge applies to a narrow range of specific tasks in the domain, “generalized domain knowledge applies to a wider class of different tasks in this domain [and] it remains a part of domain-specific knowledge” (Kalyuga 2013, p. 1479). Thus, it is plausible to assume that when group members solve together domain-specific tasks, they also construct relevant shared schemas of collaborative processes that can be transferred to other similar tasks (Gick and Holyoak 1983). This group experience may be a domain group schema (i.e., a generalized domain skill at group level) that is stored in long-term memory to solve similar learning problems (Zambrano et al. 2019). Furthermore, as is the case for any relevant knowledge structure, group experience may work as an internalized guidance that regulates transactional activities, optimizes collaborative cognitive load, and leads to better learning outcomes (Hagemann and Kluge 2017; Jurkowski and Hänze 2015; Van den Bossche et al. 2011; Zambrano et al. 2018).

Element interactivity and information distribution

The number of interacting elements to be temporally processed in working memory is the major source of cognitive load (Sweller 2010). An element can be considered as a schema that needs to be learned (e.g., a number or a set of steps to solve a mathematical problem). Any change in the elements, either in the task or in the long-term memory structure, alters the cognitive activity of working memory (Sweller et al. 2011). Consequently, variations in element interactivity may explain all cognitive load theory effects.

When learning new tasks, different ways of distributing information amongst group members may affect transactional activities and in turn collaborative learning outcomes (Kirschner et al. 2018). Mostly, investigations address the effect of information distribution from the hidden profile paradigm. From this perspective, relevant items are distributed in a way that group members are led to prefer a suboptimal solution alternative, while only the combined information uncovers the best solution (Deiglmayr and Spada 2010; Stasser and Titus 2003). However, information distribution has not been experimentally studied with learning problems from a cognitive load theory perspective. Despite this gap of knowledge, it is possible to anticipate specific results based on element interactivity.

Groups are viewed as information processing systems with more cognitive capacity than individual learners (Hinsz et al. 1997). This increased working-memory advantage is especially crucial when tasks are highly complex (Kirschner et al. 2011a). However, having a larger cognitive reservoir may have no effect when the way of distributing task information amongst members unnecessarily increases transactional activities, thereby harming learning (Deiglmayr and Spada 2010). If the information of a learning task is distributed so that one group member can solve one step of the problem alone and then communicate the partial result to the others to solve the whole task collaboratively, the intensity of the cognitive load may decrease. Reducing the number of inter-individual activities and the associated cognitive load may free working memory resources for creating a better mental representation of the task. As a result, test scores and cognitive efficiency of collaborative learning may increase. Conversely, if no step of the problem can be performed without all members sharing and discussing each of their information elements, group processing intensity may increase, which may impose an additional cognitive load and impair learning.

The present study

Based upon the above, this study examined the effects of prior collaborative experience on relevant tasks (experienced groups vs. inexperienced groups), and information distribution (low vs. high information density) on the performance of collaborative learning and its outcomes on short-term retention and delayed retention tests. We expected that experienced groups would focus their cognitive resources on better transactional activities, thus obtaining higher test scores (Hypothesis 1) and experiencing lower cognitive load, leading to higher efficiency (Hypothesis 2), than inexperienced groups. Lower information density should decrease cognitive load because learners require fewer transactional activities amongst themselves, leading to higher test scores (Hypothesis 3) and lower cognitive load with higher efficiency (Hypothesis 4) than higher information density. Therefore, it can be expected that for a task with higher information density, experienced groups obtain higher test scores (Hypothesis 5) and experience lower cognitive load, leading to higher efficiency (Hypothesis 6), than inexperienced groups. However, in tasks with lower information density, the advantage of having prior collaborative experience is redundant, leading to a smaller difference in test scores (Hypothesis 7) and a smaller difference in efficiency (Hypothesis 8) between experienced and inexperienced groups.

Method

Participants

The study was conducted with 240 Ecuadorian high-school students from a large public school in Quito as part of the mathematics classes. The gender distribution was 59 female and 181 male, and the average age was 15.58 years (SD = .84). No difference in prior knowledge was expected because the learning phase tasks are not included in the content of the very strict Ecuadorian national curriculum, which explicitly prohibits the teaching of non-prescribed topics. Further, teachers confirmed that they had not previously taught the included content and that all participants came from the same school. The use of random assignment to all conditions excluded any systematic prior knowledge differences. Despite curricular restrictions, this research received approval from the School Ethical Committee (official communication 007-VCEM/15-16) as part of their program of learning improvement. Learners were informed about the study, that their participation was voluntary, and that they would receive an academic compensation of 10 points for participation.

Design and procedure

A 2 (group experience: experienced vs. inexperienced group) × 2 (information distribution: low information density vs. high information density) factorial design was used. The study was conducted in four phases with 45-min sessions: preparation, learning, short-term retention test, and delayed retention test. Three instructors and an experimenter carried out the study. Instructors were previously informed about the procedure and were supervised by the experimenter to ensure condition fidelity. Guidelines were read aloud, and a digital clock was used to show the number of minutes allotted to each task. Time for each task of the phases was established through a pilot study showing the amount of time needed to solve a task without high time pressure. Because data from the learning, short-term retention and delayed retention phases were analyzed independently, if a learner who participated in the learning phase did not participate in the short-term retention test, they were allowed to participate in the delayed retention test.

Preparation

This phase aimed to prepare groups to undertake high-complexity collaborative tasks using quadratic equations (i.e., \(ax^{2} + bx + c = 0\), where a, b, and c represent constants and a ≠ 0). It began in the second week of the new school year after a two-month vacation. It was ensured that participants had no previous classroom familiarity with each other nor prior collaboration experience within the last two months. Participants were randomly assigned to two conditions: one half worked in 3-person groups (experienced group condition), and the other half worked individually forming the inexperienced group condition in the next phase (i.e., the learning phase). All worked in four sessions, one session per day over one week. Both conditions worked on the same tasks. The first tasks had no time constraints. The last two tasks of the second session onward had to be solved within 10 minutes. While performing the tasks, each team member was required to interact with other members in order to share their items and maintain partial results in working memory. At the end of each session, participants received the correct answers and were required to spend five minutes on planning how they could work better on subsequent tasks.

Learning

This collaborative learning phase was conducted in one session after the preparation phase. Groups that had not completed all preparation phase sessions were excluded. Random absences were caused by the school administration asking students to perform activities related to the beginning of the school year. These absences unbalanced the number of planned groups per condition. However, as participants had learned to solve quadratic equations in the previous year and all dropouts occurred before the learning phase, it was not necessary to analyze whether there was a difference between the excluded learners and those who remained. Further, an a priori analysis with a power of .8 and a medium-size effect (i.e., .06; Cohen 1988) revealed that the study needed 31 participants (11 triadic groups) per condition indicating that the remaining participants were sufficient to reliably test the hypotheses.

Learners who had worked individually were randomly distributed into 26 three-person groups (i.e., inexperienced group condition), while 39 experienced groups remained intact. All groups were randomly assigned to two conditions of information distribution (i.e., low information density and high information density). For the experienced group condition, 18 groups received the low information density and 21 the high information density materials. For the inexperienced group condition, 15 groups received the low information density and 11 the high information density materials. All groups worked on three tasks for 27 minutes, nine minutes per task. Instructors encouraged groups to focus on the task and avoid unnecessary conversations. Writing when performing calculations was not permitted in this phase to prevent cognitive off-loading through external representations (Van Bruggen et al. 2002). Only one group member was allowed to write down the answer for each task. If a group solved the problem before the allotted time, that group had to wait to start the next problem.

Short-term and delayed retention tests

Short-term and delayed retention tests were conducted one and seven days after the learning phase respectively. Participants were individually required to solve three similar problems with 10 minutes for each problem. The number of participants is shown in Table 1. In both phases, participants recorded the mental effort after each problem. Unlike in the learning phase, writing down calculations was permitted.

Table 1 Participants of the short-term retention and the delayed retention tests

Materials

Learning materials were in the domain of mathematics and the comparable domain of economics. Quadratic equations were used in the preparation phase and break-even point problems in commercial transactions (the point at which a transaction resulted in neither a profit nor a loss) in the learning phase, as well as in short-term and delayed retention test phases. All materials were paper-based.

Preparation

Quadratic equations are compulsory in the national curriculum, and all students had already learned to solve them the previous year. In the first session, participants received a booklet whose first part introduced quadratic equations with two worked examples using the factoring method. The second part presented rules on how to solve the equations collaboratively, followed by a worked example demonstrating how each member should apply the rules and a conventional task with the correct answer (Appendix Table 5). Examples of the rules are: When it is possible to perform the calculations without the help of others, do it alone and continually rehearse the results to avoid forgetting them, and Solving an equation will require many partial answers in your mind; decide who will have which partial result in your group; it is better that everyone has a result to avoid forgetting them or making a mistake in solving the equation.

Quadratic equation values were manipulated to provide group experience on the information distribution for the learning phase tasks. Equation values were unpacked to distribute them among learners (e.g., for \(-15x^{2}\), each member would receive \(-5x^{2}\)). This required group members to depend on others’ information to solve the problem. Individual participants (who were members of inexperienced groups in the learning phase) received the same values to solve the equations individually.

In the second session, groups and individuals again received the collaborative learning rules, two conventional problems with the correct answer and a conventional problem without the correct answer. In the third and fourth session, groups and individuals received three problems without correct answers. The values provided were relevant but were insufficient to solve the problem.

Learning

Calculating a break-even point is considered to be an analog task to solving quadratic equations because it displays similar characteristics such as combining multiple numerical values, calculating partial step answers, holding them in working memory, and finding a unique correct answer. Participants received a booklet introducing the relevant concepts with two worked examples, questions to prompt them, three learning tasks, and a piece of paper with examples of costs and the formula for the break-even point in units. One worked example showed the students how to calculate the break-even point in units and sales with a profit margin. The other worked example was similar but without a profit margin. The worked examples had a 7-step procedure (see Table 2). Examples of the prompt questions were: (a) What were the break-even points? (b) What were the seven steps to calculate the break-even points? (c) What was the difference between the break-even points in units and sales? (d) How did you calculate the contribution?
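To make the type of calculation concrete, the following is a minimal sketch of a textbook break-even computation without a profit margin. It is not the 7-step procedure of Table 2, which is not reproduced here. The function name `break_even`, the assumption that variable costs are given per unit, and all numerical values are hypothetical and chosen only for illustration.

```python
def break_even(fixed_costs, unit_variable_costs, price):
    """Textbook break-even point in units and in sales (no profit margin).

    Assumptions (not taken from the study): variable costs are expressed
    per unit, and the contribution is price minus unit variable cost.
    """
    total_fixed = sum(fixed_costs)            # e.g., three fixed costs
    unit_variable = sum(unit_variable_costs)  # e.g., three variable costs
    contribution = price - unit_variable      # contribution per unit sold
    if contribution <= 0:
        raise ValueError("Price must exceed the unit variable cost.")
    units = total_fixed / contribution        # break-even point in units
    sales = units * price                     # break-even point in sales
    return units, sales


# Hypothetical values, for illustration only.
units, sales = break_even([1000, 500, 500], [2, 1, 1], 9)
print(units, sales)
```

Even in this simplified form, several partial results (total fixed cost, unit variable cost, contribution) have to be computed and held before the final answer is reached, which is the property the learning tasks rely on.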

Table 2 Steps and information elements to calculate the break-even points

Task complexity was first checked by presenting the tasks to two Economics teachers and then questioning them as to the complexity. These teachers confirmed that the tasks were complex enough for novices. Also, the Sweller and Chandler method (1994) to determine complexity was used, which consisted of counting the approximate number of interacting elements. As can be seen in Table 2, problem-solving had seven steps and nine items. The items were three fixed costs, three variable costs, price, profit margin, and produced articles. Each step varied in the number of interacting items (Column 3 of Table 2). This amounted to a total of 45 items (including mathematical signs). For each step, a partial answer had to be calculated and held in working memory (Column 4 of Table 2) to be integrated with another partial answer. Writing was not permitted. The high complexity of the tasks was confirmed by the mean for mental effort in the learning phase tasks which was 7.38 on the 9-point scale (see “Measurement” section).

All groups received the same tasks, but with different information distributions. For the low information density groups, steps 2 and 5 could be solved without communication or coordination between peers. A group member only needed to share the partial answer to calculate the other steps and find the final answer. The items were balanced so that all group members had all task information during the learning phase. For example, for the first task, member 1 received three variable costs, for the second task three fixed costs and the profit margin, and for the third task, the price and the number of produced articles. For the high information density groups, no step could be performed without each member communicating his/her items to others and coordinating their calculations. To avoid confusion during the learning processes, members were given different examples of fixed and variable costs with the formula for the break-even point in units (see Step 6 of Table 2).

Short-term and delayed retention tests

Six high-complexity problems were used for testing. The problems were similar to the learning tasks, but the business situation and cost names were varied. Participants received three tasks a day after the learning tasks (i.e., short-term retention test), and the other three seven days after the learning tasks (i.e., delayed retention test). Each problem included a table with seven rows to write down the calculations for each step of the task’s solution.

Measurement

Performance

Performance was measured in the learning, short-term retention test, and delayed retention test phases. In the learning phase, each of the three tasks scored 1 point if the answer was correct and 0 points if it was incorrect, giving a maximum of 3 points. For each of the three short-term retention test tasks, 7 points could be awarded, based upon the 7 calculations required to determine the break-even point. Each calculation was scored individually by considering whether correct values and mathematical operations were used. A correct step’s calculation received 1 point and an incorrect step’s calculation 0. If a step was partially correct, a proportional score was given. This resulted in a maximum score of 21 points and a minimum of 0. The same scoring was applied to the delayed retention test’s tasks. The scores were transformed into proportions.
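The scoring rules described above can be sketched as follows. This is an illustrative reading of the description, not the authors' scoring script; the function names `learning_score` and `retention_score` and the example values are hypothetical.

```python
def learning_score(task_correct):
    """Proportion of learning tasks answered correctly (1 point per task)."""
    return sum(1 for correct in task_correct if correct) / len(task_correct)


def retention_score(step_scores_per_task):
    """Proportion of points earned on a retention test.

    Each task is a list of step scores between 0 and 1, where a
    fractional value stands for a partially correct step.
    """
    earned = sum(sum(steps) for steps in step_scores_per_task)
    maximum = sum(len(steps) for steps in step_scores_per_task)
    return earned / maximum


# Hypothetical group: two of three learning tasks correct.
print(learning_score([True, False, True]))

# Hypothetical learner: one half-correct step in the third task (20.5 / 21).
print(retention_score([[1] * 7, [1] * 7, [1, 1, 1, 0.5, 1, 1, 1]]))
```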

Cognitive load

Cognitive load was measured after the third task of the learning phase and after each task in the short-term and delayed retention test phases using a subjective 9-point mental effort scale (Paas 1992). The collaborative cognitive load of the learning phase was calculated by averaging the mental effort scores of the group members. Individual scores for mental effort were used in the other phases.

Efficiency

Efficiency (E) refers to the quality of learning as a result of combining performance and mental effort (Paas and Van Merriënboer 1993). A high efficiency denotes relatively high performance in combination with relatively low mental effort. By contrast, low efficiency means relatively low performance with relatively high mental effort. Efficiency was computed by standardizing each participant’s scores for task performance and mental effort. For each participant, z-scores were calculated for effort (R) and performance (P), and efficiency was obtained using the formula \(E = (P - R)/\sqrt{2}\).
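
The efficiency computation can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' analysis script: the function names and data are hypothetical, and the sample standard deviation is assumed for the z-scores because the text does not say which estimator was used.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def z_scores(values: Sequence[float]) -> List[float]:
    """Standardize values across participants (assumes the sample SD)."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("cannot standardize values with zero variance")
    return [(v - m) / s for v in values]


def efficiency(performance: Sequence[float], effort: Sequence[float]) -> List[float]:
    """Efficiency per participant: E = (P - R) / sqrt(2), with P and R as z-scores."""
    if len(performance) != len(effort):
        raise ValueError("performance and effort must have the same length")
    z_p = z_scores(performance)
    z_r = z_scores(effort)
    return [(p - r) / math.sqrt(2) for p, r in zip(z_p, z_r)]


# Hypothetical data: performance as proportions, effort on the 9-point scale.
performance = [0.2, 0.5, 0.8]
effort = [8, 6, 4]
E = efficiency(performance, effort)  # approximately [-1.41, 0.0, 1.41]
```

In this hypothetical data set, the participant with the highest performance and the lowest effort obtains the highest efficiency. Because both components are standardized, efficiency scores average zero across participants, so positive values indicate above-average efficiency within the sample.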

Results

Data were analyzed with 2 (group experience: experienced vs. inexperienced group) × 2 (information distribution: high information density vs. low information density) multivariate analyses of variance (MANOVA) and analyses of variance (ANOVA). Dependent variables were performance, mental effort, and efficiency, which were measured and independently analyzed for the learning, short-term retention, and delayed retention phases. Descriptive statistics are shown in Table 3. Partial eta-squared was used to determine the effect size, with values of .01, .06, and .14 corresponding to small, medium, and large effects, respectively (Cohen 1988).

Table 3 Means and standard deviations for dependent variables of study phases

Learning phase

MANOVA revealed significant main effects for group experience, F(2, 60) = 10.40, Wilks’ Λ = .74, p < .001, \(\eta_{\text{p}}^{2}\) = .26, and information distribution, F(2, 60) = 3.33, Wilks’ Λ = .90, p = .04, \(\eta_{\text{p}}^{2}\) = .10, which indicate that these variables affect a combination of performance, mental effort, and efficiency scores. The interaction between these effects was nonsignificant, F(2, 60) = .67, Wilks’ Λ = .98, p = .52, \(\eta_{\text{p}}^{2}\) = .02.
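
As a reading aid for the effect sizes, partial eta-squared can be recovered from an F statistic and its degrees of freedom through the standard identity \(\eta_{\text{p}}^{2} = (F \cdot df_{1})/(F \cdot df_{1} + df_{2})\). The Python sketch below applies this identity and the benchmarks cited from Cohen (1988). The function names and the "negligible" label are hypothetical, and the sketch is not a description of how the authors computed their values.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared from F and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def effect_size_label(eta_p2: float) -> str:
    """Classify an effect using the .01 / .06 / .14 benchmarks cited in the text."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"  # hypothetical label for values below .01


# Learning-phase main effects as reported: F(2, 60) = 10.40 and F(2, 60) = 3.33.
experience = partial_eta_squared(10.40, 2, 60)    # about .26, as reported
distribution = partial_eta_squared(3.33, 2, 60)   # about .10, as reported
labels = (effect_size_label(experience), effect_size_label(distribution))
```

By these benchmarks, the learning-phase effect of group experience is large, whereas the effect of information distribution is medium.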

Two-way ANOVAs were conducted to examine the variables separately (see Table 4).

Table 4 Two-way analyses of dependent variances for each phase

Concerning performance, ANOVA revealed that experienced groups (M = .52, SD = .42) outperformed inexperienced groups (M = .13, SD = .21). For mental effort, groups with low information density (M = 6.80, SD = 1.80) perceived lower mental effort than groups with high information density (M = 7.89, SD = 1.68). For efficiency, experienced groups (M = .32, SD = 1.26) were more efficient than inexperienced groups (M = −.48, SD = .65), and groups with low information density (M = .29, SD = 1.09) were more efficient than groups with high information density (M = −.30, SD = 1.11).

Short-term retention test phase

MANOVA revealed a significant main effect for group experience, F(2, 173) = 9.07, Wilks’ Λ = .91, p < .001, \(\eta_{\text{p}}^{2}\) = .10, and information distribution, F(2, 173) = 3.60, p = .03, Wilks’ Λ = .96, \(\eta_{\text{p}}^{2}\) = .04. This suggests that both independent variables affect performance, mental effort, and efficiency simultaneously. The interaction between these effects was nonsignificant, F(2, 173) = 2.69, Wilks’ Λ = .97, p = .07, \(\eta_{\text{p}}^{2}\) = .03.

For performance, ANOVA (see Table 4) revealed that experienced groups (M = .45, SD = .22) significantly outperformed inexperienced groups (M = .31, SD = .29). It also showed that groups with low information density (M = .42, SD = .25) outperformed groups with high information density (M = .36, SD = .27). Concerning mental effort, experienced groups (M = 6.47, SD = 2.51) reported more mental effort than inexperienced groups (M = 5.53, SD = 2.58). Regarding instructional efficiency, the significant interaction between the two factors indicated that for the task with high information density, experienced groups were more efficient than inexperienced groups (p = .02, \(\eta_{\text{p}}^{2}\) = .03). However, for the task with low information density, experienced and inexperienced groups did not differ significantly (p = .37, \(\eta_{\text{p}}^{2}\) = .01).

Delayed retention test phase

MANOVA yielded a significant main effect for group experience, indicating that this variable affects performance, mental effort, and efficiency, F(2, 176) = 8.64, Wilks’ Λ = .91, p < .001, \(\eta_{\text{p}}^{2}\) = .09. The main effects for information distribution, F(2, 176) = 1.07, Wilks’ Λ = .99, p = .35, \(\eta_{\text{p}}^{2}\) = .01, and the interaction between these effects, F(2, 176) = 2.34, Wilks’ Λ = .97, p = .10, \(\eta_{\text{p}}^{2}\) = .03, were nonsignificant.

For performance (Table 4), the analysis revealed that experienced groups (M = .46, SD = .30) outperformed inexperienced groups (M = .29, SD = .26). For instructional efficiency, experienced groups (M = .09, SD = .83) were more efficient than inexperienced groups (M = −.12, SD = .75). The significant interaction between the two factors indicated that for the task with high information density, experienced groups were more efficient than inexperienced groups (p = .01, \(\eta_{\text{p}}^{2}\) = .04). However, for low information density, experienced and inexperienced groups did not differ significantly (p = .90, \(\eta_{\text{p}}^{2}\) = .00).

Discussion

The goal of this study was to examine the effect of prior collaborative experience (i.e., experienced vs. inexperienced groups) and collaborative intensity related to task information distribution (i.e., high information density vs. low information density) on test scores and cognitive load during collaborative learning. We discuss the result for each hypothesis.

It was hypothesized that groups with prior collaborative experience would obtain higher test scores (Hypothesis 1) and lower cognitive load resulting in increased efficiency (Hypothesis 2) than inexperienced groups. Results confirmed the expectation for increased test scores following prior collaborative experience in all phases and increased efficiency in the short-term and delayed retention tests. These are the primary results of this experiment.

The results suggest that prior collaborative experience in similar tasks was transferred to new complex learning tasks (Kalyuga 2013) and helped to optimize the cognitive load associated with transactional activities (Kirschner et al. 2018). Acquiring collaborative schemas based on similar tasks permitted groups to deal with the high cognitive load of high information density. It seems that working memory resources invested in transactional activities were used to construct high-order schemas of the learning tasks (Van den Bossche et al. 2011). This result is in line with Fransen et al. (2013) in the sense that experienced groups developed group and task schemas. In contrast, inexperienced groups could not handle the high cognitive load, which led to the construction of poor knowledge of the tasks. An interesting result is that inexperienced groups reported lower mental effort in the short-term retention phase, which significantly decreased the efficiency of experienced groups. One possible explanation for this result is that their lower knowledge level may have led them to underestimate the complexity of the tasks and to overestimate their current performance, which in turn decreased their mental effort ratings (Nugteren et al. 2018).

Concerning information distribution, we expected that low information density would decrease cognitive load because learners require fewer transactional activities amongst themselves, leading to higher test scores (Hypothesis 3) and efficiency (Hypothesis 4) than high information density. Results supported these hypotheses for performance in the short-term retention test and for efficiency in the learning phase. It seems that the advantage of some group members being able to solve part of the problem individually affected transactional activities during the learning phase, increasing efficiency. However, this benefit only improved performance in the short term (i.e., the short-term retention test) and faded in the delayed retention test.

As the distribution of the amount of information alters intrinsic cognitive load (Sweller et al. 2011), it is intriguing that differential information distribution did not achieve the same impact as prior collaborative experience (see effect sizes in MANOVAs). This result might be explained by the positive interdependence acquired by the experienced groups. A relevant study that may support this explanation is provided by Johnson et al. (1991), who compared the impact of positive goal interdependence and resource interdependence. They found that groups with positive goal interdependence exhibited better results than groups with positive resource interdependence. Our data seem to agree with their results in the sense that shared schemas on how to work on relevant tasks may have affected collaborative learning more decisively than the interdependence based on information distribution.

Regarding the expected higher test scores (Hypothesis 5) and efficiency (Hypothesis 6) of experienced groups in tasks with higher rather than lower information density, the results did not yield evidence for performance. However, the experienced groups were more efficient in the short-term and delayed retention tests. The higher efficiency suggests that task-based collaboration schemas could be activated and transferred to the learning tasks. Although inter-individual activities under high information density conditions were more intense (i.e., more communication and coordination activities), it seems that experienced groups optimized the cognitive load and learned to solve problems more efficiently. The lack of significant results in the learning phase suggests that the advantages of collaboration are not always observable immediately (Soderstrom and Bjork 2015). Subsequent individual post-tests revealed the benefits of having acquired collaboration schemas.

Concerning the expected reduction in the differences in test scores (Hypothesis 7) and efficiency (Hypothesis 8) between experienced and inexperienced groups when learning with low information density, results supported this expectation in all phases of the study, with no significant differences on these measures. Data suggest that prior collaborative experience may be redundant when tasks have a low level of complexity in terms of inter-individual activity. Learners may have devoted working memory resources to reconciling their schemas of working together with the low information density condition, which required less group interaction. The experience of sharing each item and performing shared calculations for each problem step may have interfered with individual calculations. This could have unnecessarily increased the amount of inter-individual activity and the cognitive load, impairing performance and efficiency.

The results of this study allow us to conclude that prior collaborative experience on relevant tasks and how the interacting information is distributed amongst learners are promising research lines that can improve our knowledge about collaborative learning. Data supported the assumption that grouping learners does not necessarily lead to better learning. For this reason, providing collaborative schemas using relevant tasks may help to improve group performance and member learning. This group advantage is crucial in learning situations where the tasks are complex (i.e., high level of element interactivity) and information distribution among group members demands a high level of intra-group activity.

Instructional design of collaborative learning should consider the interacting information elements of a task and the cognitive load associated with transactional activities. Cognitive load theory assumes that any learning task is divisible into meaningful elements, and their distribution in collaborative settings may result in fundamental differences at the group and individual levels. For this reason, given that the goal of learning was to solve problems individually, it is important, firstly, not to lose sight of the fact that the performance and efficiency of collaborative learning must be evaluated in terms of the individual learning of group members (Kirschner et al. 2009). Accordingly, collaborative learning is better when it promotes better individual learning.

Secondly, students who learn in groups with high-interactivity level tasks require relevant group work schemas (Zambrano et al. 2019). Learning tasks needed to be solved with all information elements provided to group members, and information distribution was varied to test its effects on inter-individual activities. From the perspective of cognitive load theory, transactional activities are complex meaning-making cognitive operations whose load on working memory may foster or inhibit the acquisition of schemas in long-term memory (Tindale and Kameda 2000). As coordination and communication activities during learning may impose a high cognitive load, it seems that task-based group schemas help groups to better guide their actions to achieve higher effectiveness.

Our findings have important instructional implications for collaborative learning with high-complexity problems. If learners are novices, learning tasks are complex, and information distribution demands high inter-individual activity, teachers should prepare group members prior to collaboration using similar problems that are already known to them so that they learn to work together. During the learning phase, members should receive the conceptual and procedural knowledge to solve the problems. The distribution of information should be balanced among all group members so that everyone has the same opportunity to process all types of information elements. Because each member has only a part of the information, communication and coordination processes help students to acquire a better mental representation of the tasks. If task information does not demand high interactivity among group members, it is not necessary for teachers to prepare the learners to collaborate.

This study has some limitations. It is necessary to identify which specific factors are associated with the prior collaborative experience and the cognitive load they impose during learning (Janssen et al. 2010). Future research should explore group composition, such as whether the benefits decrease when new groups are formed with members who differ in their experience (Prichard et al. 2011). Further, more investigation is needed during class periods to determine how social factors such as friendship between learners or emotional regulation skills affect information distribution and its cognitive load.