Introduction

Selective attention refers to the preferential processing of a subset of the available information in the environment along with some concomitant decrease in the processing of the remaining information (Chun et al., 2011; Treisman & Gelade, 1980). At least two specific aspects of selective attention are of particular relevance to the current work. The first aspect is that selective attention is limited in its capacity. This is true whether the unit of interest is number of objects, feature dimensions, spatial locations, or temporal intervals. Only so much information can receive the benefit of preferential processing – an idea inherent in the term “selective.”

The second aspect is that the selection is imperfect. Consider a task in which participants are presented with five arrows arranged horizontally and are told that they need to indicate the direction of the middle arrow as quickly as possible. Given such a task, participants will consistently respond more quickly if the four surrounding arrows point in the same direction as the middle arrow than if the surrounding arrows point in the opposite direction of the middle arrow (Fan et al., 2002; Rueda et al., 2004). Thus, despite the fact that the surrounding arrows are completely task irrelevant and should be unattended, they are clearly not perfectly selected out (i.e., if they were perfectly selected out, their direction would not influence reaction time). Selective attention is therefore considered to be a process that biases the extent to which certain information is processed, rather than a process that perfectly selects some information for additional processing while fully filtering out the remaining information (Eriksen & St. James, 1986; Ma & Huang, 2009; Posner & Petersen, 1990; Treisman & Gelade, 1980).

These two aspects of selective attention, capacity limitations and selectivity, have been of significant interest in the field – in particular the extent to which these aspects are independent or if they interact. Load theory, first proposed by Lavie and colleagues, has strongly argued for the latter (Lavie et al., 2004; Lavie & Tsal, 1994). Load theory predicts that irrelevant items are disproportionately likely to be processed when there are few relevant items present. In essence, when few relevant items are present, there is otherwise unallocated attentional capacity, which then unavoidably “spills over” to irrelevant items. When a large number of relevant items are present, attentional capacity is saturated and there is a reduction in the degree to which irrelevant items are processed (i.e., there is no capacity remaining to spill over to irrelevant items).

Investigations into load theory have most commonly employed a modified visual search paradigm (e.g., Huang-Pollock, Carr, & Nigg, 2002; Lavie, 1995; Lavie et al., 2004; Marciano & Yeshurun, 2017; Maylor & Lavie, 1998). Participants in such a paradigm are asked to search within a pre-defined number of locations (e.g., within one of eight possible rings arranged in a circle) for one of two possible target shapes (e.g., a square or a diamond), and to indicate which of the two was present in each trial display. The measure of interest in this task is not simply whether or not the target shape was found or the speed with which it was found. Instead, the measure of interest involves an assessment of the influence of another shape that appears outside of the set of pre-defined search locations and thus is always task irrelevant. The irrelevant item can either be the same shape as the target shape that appears on the given trial (e.g., if the target shape that is present was a diamond, the irrelevant shape is also a diamond) or it can be the possible target shape that is not present on the trial (e.g., if the target shape that is present was a diamond, the irrelevant shape is a square). Individuals usually respond more quickly when the target and irrelevant shape match as compared to when they do not match (i.e., there is a compatibility effect). The critical measure with respect to load theory is the extent to which the magnitude of the compatibility effect differs as a function of the number of shapes that are present in the possible target locations. On trials where only the target is present (i.e., the other possible locations are empty), load theory suggests that there is a great deal of unallocated attentional capacity, which should then naturally spill over to processing the irrelevant shape. When the target and irrelevant shape match, because both are being strongly processed, this match will greatly speed up reaction times. When the target and irrelevant shape conflict, again, because both are being strongly processed, this mismatch will significantly retard reaction time. The additional processing of the irrelevant shape therefore serves to increase the magnitude of the compatibility effect. Conversely, on trials when there are many shapes present within the possible target locations, attentional capacity is saturated, leaving no capacity remaining to spill over to the irrelevant shape. With no attentional resources being devoted to the irrelevant shape, reaction times are expected to be equivalent regardless of whether that irrelevant shape matches or does not match the target shape. Consistent with perceptual load theory, a substantial number of empirical papers in the domain have observed such an outcome, with the magnitude of compatibility effects decreasing with greater amounts of perceptual load in the target task (e.g., Chen et al., 2008; Green & Bavelier, 2003; Kim, Kim, & Chun, 2005; Rees, Frith, & Lavie, 1997; Yi, Woodman, Widders, Marois, & Chun, 2004; for a review, see Murphy, Groeger, & Greene, 2016).

Despite its clear utility in explaining a number of such empirical results (Chun et al., 2011), load theory has faced criticisms. For instance, Marciano and Yeshurun (2017) examined several of their own studies using a task modeled after Lavie et al. (2004). Across nine experiments from two published studies using this task, group-level analyses indicated the target-distractor interaction predicted by load theory in only two experiments. Furthermore, when examining individual participants’ data, it was clear that only a minority of individuals produced patterns corresponding to the predicted target-distractor interaction. That is, across studies, fewer than half of participants demonstrated a decrease in target-load effects when distractor-load was increased. Instead, individual participants showed all possible combinations of target and distractor effects. Moreover, average (i.e., group-level) effects frequently produced an interaction between targets and distractors opposite to the interaction predicted by load theory. When a subset of participants who had demonstrated the predicted load effect in an initial session were followed up, these same participants did not consistently demonstrate the same pattern in subsequent sessions. From this finding it was suggested that load effects may be reliable only in certain populations or tasks. Another possible explanation would be that the load effect size is simply much smaller than previously thought. In the latter case, aggregate statistics could produce canonical load effects while individuals’ noisy estimates would often not conform to this pattern. Aggregated load effects would reflect a real phenomenon that is difficult to measure within individuals.

Together the results of Marciano and Yeshurun (2017), in particular the lack of individual-level load effects and within-participant reliability, motivate at least three possible lines of inquiry regarding load theory:

1. Task: Are the effects observed in the load theory literature idiosyncratic to a certain task or tasks? Are the predictions of load theory confirmed in novel tasks?

2. Population: Is the load effect stronger or weaker in different populations (e.g., individuals with greater cognitive control may have an overall lower distractor interference effect; Kane & Engle, 2003; also see Maylor & Lavie, 1998; Remington, Cartwright-Finch, & Lavie, 2014), potentially causing the calculation of this effect to be overwhelmed by noise when studying high-performing populations?

3. Cross-task reliability: If the effect is indeed robust for a given task and population, are the observed effects task-specific or do they indicate patterns of processing that are shared with other tasks (i.e., cross-task correlations in these effects)?

Although Marciano and Yeshurun (2017) found that neurotypical young adults do not always demonstrate reliable load effects, children have uniformly inferior capacity and selection processes compared to young adults (Cowan, Nugent, Elliott, Ponomarev, & Saults, 1999; Dye & Bavelier, 2010; Rueda et al., 2004; Simmering, 2016), and therefore may be a useful population in which to test the applicability of load theory in a wider array of contexts. In this vein, Huang-Pollock, Carr, and Nigg (2002) provided an early test of the perceptual load theory of selective attention as it applies to cognitive development. By comparing young adults' performance to the performance of children of various ages (7–11 years old) on the visual search task described above, they found that children's response times were slower than adults’ overall, as well as being disproportionately slower in the presence of salient distractors when set size was low. Such results could be viewed as being broadly consistent with the idea that children have reduced selectivity in their allocation of attention. When set size was high, the effect of distractors on reaction time was essentially identical across all ages of children and adults. The apparent lack of developmental differences suggests that, at the high set sizes utilized, attentional capacity was exhausted in both adults and children. Adding further nuance still, error rate differences associated with distractor compatibility decreased with set size. In other words, more errors were made on incompatible irrelevant shape trials than on compatible irrelevant shape trials, but this difference was diminished at higher set sizes (a finding that would be broadly well-matched with load theory). But unlike with reaction time, this set-size-related reduction in the error-rate difference between compatible and incompatible trials did not reliably change with age (i.e., the interaction between set size and age that was observed in response time was much less pronounced in accuracy). In all, these findings show that distractor interference in children was similar to that of adults at high target loads while being much greater than adults at low target loads (for convergent evidence of parallels between adults and children, using event-related potentials responsive to distractors, see Couperus, 2011).

Here we extended work on attentional development and load theory beyond the prototypical visual search paradigm, using two tasks that each incorporated capacity and selection dimensions of visual attention. We adapted one of these tasks from one-shot change detection, a common test of visuospatial short-term memory (Luck & Vogel, 1997; Machizawa & Driver, 2011; Rouder et al., 2011). In this task participants briefly saw a display of items followed by a delay period. After the delay period, a second display of items was presented and participants were asked to identify whether any of the items in the second display were different to items in the first display. To the degree that visuospatial short-term memory relies on attention (Chun, 2011; Cowan, Fristoe, et al., 2006a; Shipstead et al., 2014), change detection with distracting stimuli should produce patterns of performance that are analogous to those of the visual search paradigm of Lavie (1995). Specifically, increases in the load associated with targets as well as increases in the load associated with distractors should each produce diminished performance. However, increases in the load associated with targets should decrease the negative effects of increases in the load associated with distractor information (i.e., there should be an interaction between the two dimensions; the greater the amount of capacity that has to be devoted to target information, the less that should be unallocated and therefore spill over to distractor information).

As a first test of our experimental paradigm’s ability to detect both capacity-related and selection-related changes in performance, we tested adults on two versions of one-shot change detection with varying numbers of targets and distractors (one version required selecting relevant items by color and detecting shape changes, the other version required selecting relevant items by shape and detecting color changes). For example, in the shape-change task version, participants saw an array of shapes and were instructed to attend to only red items while ignoring items of other colors. The array was removed for a delay period, after which another array was presented. In the second array, one of the targets either had or had not changed in shape, while all distractors remained the same. Participants were expected to have greater difficulty detecting shape changes with larger numbers of red items than with smaller numbers, and likewise with larger numbers of distractors. We hypothesized that change detection performance should also be affected by an interaction between target and distractor quantities, as predicted by perceptual load theory.

We also sought to further qualify these effects by testing the same participants in an attention task using stimuli and configurations identical to those of the shape-based selection task above. To this end we created a novel enumeration task with categorically distinct distracting items while using the same stimuli (and the same filtering dimension) as the shape-based selection change-detection task (i.e., the color-change version, in which relevant items were selected by shape). The novel visual enumeration paradigm (Miller & Baker, 1968) combined capacity and selection (i.e., target-load and distractor-suppression) dimensions while requiring discrete parametric responses (in contrast to the strictly two-alternative forced-choice paradigm of the change-detection task, or of Huang-Pollock, Carr, and Nigg, 2002). The use of many possible responses was strongly motivated by the results of Marciano and Yeshurun (2017), whose findings suggested the possible utility of paradigms that would allow errors to be treated parametrically rather than as binary categories (i.e., where amount of error is informative). By requiring an integer response in estimating the number of briefly presented targets amidst simultaneously presented distractors, we were able to estimate not only the presence of errors but also their magnitude, and thus the parametric increase in processing error due to increased numbers of distractors (i.e., increased selection load) and increased numbers of targets (i.e., increased capacity load).

Additionally, we note that our paradigm differs from previous enumeration paradigms using distractors. For instance, in the work by Trick and Pylyshyn (1993) examining enumeration with distractors, the stimuli remained present until a response was made, and participants were instructed to count the targets as fast as possible. It was thus the case that errors were not informative in the way that they are in the current design.

As is true in the change-detection paradigm, perceptual load theory predicts that in this enumeration paradigm, both selection and capacity demands would be associated with costs to task performance, meaning that increased capacity load or an increased number of distractors would increase enumeration error. However, the theory also predicts that these dimensions would interact. While distractor loads should increase enumeration error when capacity loads are small, large capacity loads should be associated with minimal distractor processing. In this case, large-capacity-load trials would have smaller distractor-load effects than small-capacity-load trials. While null interactions may be difficult to interpret due to the potential for idiosyncratic effects of specific tasks (i.e., it is theoretically possible that particular details of task instantiation prevent load theory-predicted outcomes from arising), at a minimum such null interactions would fail to support load theory. In contrast, the presence of the predicted effects would provide positive evidence for the generality of load theory.

Experiment 1: Adults

In Experiment 1 we tested the generality of load theory in two novel versions of a change-detection paradigm as well as in an enumeration paradigm. Each of these tasks allowed us to address one of our key questions (i.e., whether effects predicted by load theory would be seen in new tasks). In addition, reliable effects in these tasks would allow us to test the within-participant consistency of load effects across task contexts. Results from Experiment 1 also guided our choice of methods in Experiment 2 with children (i.e., testing the generality of load theory across populations).

Method

All procedures were approved by the research ethics board at the University of Wisconsin-Madison.

Participants

Thirty-two young adults (mean age=19.4 years, SD = .55, range = 18.6–20.7; ten female) participated for course credit at the University of Wisconsin-Madison. Participants were recruited from Introduction to Psychology courses using an online participant management system. We did not collect demographic data, but the course population is primarily White, with the largest minority group comprising students of East Asian descent, and is generally middle to upper-middle class. We further note that in this and subsequent studies, although we did not explicitly test for color vision deficiencies, (1) our stimuli were not designed to be perceptually isoluminant and (2) participants completed practice in the presence of an investigator to ensure comprehension, at which point an inability to differentiate colors should have become apparent.

Apparatus

All tasks were presented on 22-in. Dell monitors using Psychtoolbox (Brainard, 1997; Kleiner et al., 2007) in MATLAB, running on Dell Optiplex computers.

Procedure

Change detection

In a standard change-detection task, a set of items is briefly presented, removed, and then another set of items is presented. The first and second sets of items may be identical, or else non-identical due to the change of one item. The participant’s goal is to correctly identify whether the first and second displays were identical or non-identical (Luck & Vogel, 1997; Rouder et al., 2011). While this task is typically referred to as a test of visuo-spatial short-term memory, attentional processing is central to many theories of working and short-term memory. In accordance with this, attention to display items commonly plays a major role in models of change-detection task performance (Cowan, 1995; Rouder et al., 2011). Additionally, unlike in the standard change-detection task, where all items are potentially relevant (i.e., any item has the potential to change in the second display), in our version, a subset of items that were explicitly irrelevant was also included in the displays (see Fig. 1A). This in turn should further implicate attentional processes in task performance.

Fig. 1

Example trials from the two tasks. In the change-detection memory task (A) participants saw an array of targets [fish] amidst distractors, and reported whether any of the target colors changed after a short delay. In the enumeration task (B) participants briefly saw sets of targets [fish] amidst distractors, and subsequently reported the numbers of targets present in the display

A 250-ms auditory cue at 800 Hz signaled the start of a trial. A delay followed, with a duration randomly chosen between 250 ms and 750 ms. Next the first display was presented for 250 ms. This first display consisted of two, four, or six target items and zero, two, four, or six distractor items. Stimuli were presented in an invisible 5-by-5 item grid with location centers spaced 1.5° apart. Filled locations were chosen at random. Each stimulus was approximately 1.8° in height and width. All targets and distractors were drawn from nameably different colors (see Supplemental Material, Fig. S3).

Two versions of the task were utilized – a color-change version and a shape-change version (with the names alluding to the stimulus dimension along which items could change for the given task versions). For the color-change version, the targets were fish and the distractors were circles (within trials, drawn in non-overlapping color sets). After the first display disappeared, a blank screen was presented for 1,000 ms. The second display was then presented until participant response. This display was identical to the first display on 50% of trials, while on the other 50% of trials the color of one of the targets changed. The participants’ task was to indicate whether the first and second display were identical or different. No explicit feedback was given to the participants regarding the accuracy of their response. After response, there was a 500-ms delay before the next trial started. The shape-change version, using shapes from Wheeler and Treisman (2002), worked in an analogous fashion except that the targets were red shapes and the distractors were non-red shapes. Thus, in 50% of trials, in the second display the shape of one of the targets changed. Within a task version (shape/color), the combinations of target number and distractor number were presented in a pseudorandom fashion (i.e., equal numbers of all combinations were represented) for a total of 192 trials overall. These trials were preceded by eight practice trials, of small set sizes, identical to the regular task. In all, each memory task took approximately 15 min. These two tasks were counterbalanced to occur either before or after the enumeration task.

Enumeration

All stimuli were identical to those used in the change detection color-change task described above with two key exceptions: (1) after the first display disappeared, no probe screen appeared. Instead, a blank screen was presented and participants were asked to make an enumeration response; and (2) different numbers of possible targets (1–10) and distractors (3 or 7) were utilized (see Fig. 1B). Participants were asked to indicate the number of targets they believed were present using the numeric keys above the letters on a standard keyboard (with stickers on the “0,” “-,” and “=” keys to indicate that they should be pressed for responses of 10, 11, or 12, respectively). Participants were not told what the range of possible targets was (although the possible responses limited the range from 1 to 12) nor were they given feedback regarding the accuracy of their responses. The combinations of target number and distractor number were presented in a pseudorandom fashion (i.e., each combination was displayed 11 times) for a total of 220 trials overall. These trials were preceded by eight practice trials, of small set sizes, identical to the regular task.

Results

Analysis

The effects predicted by load theory would be evident in cross-participant patterns of within-participant changes in performance due to changes in stimulus number across target or distractor dimensions. Linear mixed-effects models, in which group-level coefficients are estimated in parallel with participant-level coefficients in a hierarchical structure, are ideally suited to the estimation of such stimulus-dependent changes. Errors were first averaged for every combination of participant, target number, and distractor number, then analyzed in multilevel linear models predicting the amount of error given the target number and distractor number in each cell, while accounting for participant-level variation. In this analysis, main effects indicate differences in error due to distractor number or target number differences, while holding each of the other predictors constant. Interactions, meanwhile, indicate that the effect of one predictor on error changes as the level of another predictor changes (e.g., a reliable negative target-distractor interaction would support the prediction of load theory). All analyses were conducted in R, with linear multilevel models fit with the lme4 package (Bates et al., 2015). Random effects were specified using a maximal structure (Barr et al., 2013). Null hypothesis tests were implemented with the Kenward-Roger approximation using the R package pbkrtest (Halekoh & Højsgaard, 2014).
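
To make the modeling approach concrete, the following is a minimal sketch of how such a model could be specified with lme4 and tested with pbkrtest. The column names (err, targets, distractors, subject) and the cell-level data frame cells (constructed in the sketch after the next paragraph) are placeholders for illustration, not the authors' actual code.

    library(lme4)      # linear mixed-effects models (Bates et al., 2015)
    library(pbkrtest)  # Kenward-Roger F-tests (Halekoh & Højsgaard, 2014)

    # Cell-level error predicted by target number, distractor number, and their
    # interaction, with a maximal by-participant random-effects structure
    # (Barr et al., 2013).
    full <- lmer(err ~ targets * distractors +
                   (targets * distractors | subject),
                 data = cells)

    # Kenward-Roger approximate F-test of the target-by-distractor interaction,
    # the term whose negative sign load theory predicts.
    reduced <- update(full, . ~ . - targets:distractors)
    KRmodcomp(full, reduced)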

Averaging within data cells resulted in 20 cases for each participant in enumeration and 12 cases in change detection. Absolute errors were used in enumeration (e.g., a response of three with five targets present would be an error magnitude of 2, as would a response of 7). The largest overestimate for a set size of 10 was therefore 2 (i.e., a response of 12). This did not appear to bias parametric estimates of error increases by set size; see Supplementary Material for analyses of a truncated subset of data that allowed larger errors. Binary errors were used in change detection (i.e., percent incorrect trials). We note that alternative error functions (e.g., A’ for change detection or squared error for enumeration) did not produce qualitatively different results.
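
A sketch of how these cell-level error measures could be computed from trial-level data; the data frames and column names (response, targets, distractors, correct, subject) are again hypothetical.

    library(dplyr)

    # Enumeration: mean absolute error per participant x targets x distractors
    # cell (e.g., responding 3 or 7 when 5 targets were shown both count as an
    # error of 2), giving 20 cells per participant (10 target x 2 distractor levels).
    enum_cells <- enum_trials %>%
      mutate(abs_err = abs(response - targets)) %>%
      group_by(subject, targets, distractors) %>%
      summarise(err = mean(abs_err), .groups = "drop")

    # Change detection: percent incorrect trials per cell (12 cells per
    # participant per task version), assuming 'correct' is coded 0/1.
    cd_cells <- cd_trials %>%
      group_by(subject, targets, distractors) %>%
      summarise(err = mean(1 - correct) * 100, .groups = "drop")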

Response times (RTs) were log-transformed before analysis to ensure normality of residuals (i.e., because RT distributions are approximately log-normally distributed; Huang, Mo, & Li, 2012; Limpert & Stahel, 2011). Participants were not instructed to respond quickly, so RT variation simply reflects incidental differences in the time taken to complete the tasks under the various conditions. Accuracy, not RT, was the primary measure of interest.

Change detection – combined tasks – adults

We first tested for the overall presence of load effects in the change-detection tasks (see Fig. S1 for data per combination of factors). We combined the data from both versions of the task and estimated the effects of target number, distractor number, and their interaction. Each of these effects was allowed to vary by (i.e., interact with) task type, with the two task types being coded as -0.5 and 0.5 such that lower-order effects not including task type would be estimated at an intermediate level between the two task types. In this model the only significant effect independent of task type was that of target number (b=.057, F(1,206.0)=209.8, p<.001). The effects of distractor number and the target × distractor interaction were not significant (both Fs<2.4, ps>.1). Of possible interactions with task type, only the interaction with target was significant (b=-0.024, F(1,627.8)=10.6, p=.001), with other ps>.45.
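
A sketch of the centred task-type coding described above, assuming a combined cell-level data frame cd_both with a task label distinguishing the two versions (labels and column names are ours); the random-effects structure is simplified here for illustration.

    # Code the two change-detection versions as -0.5 / 0.5 so that lower-order
    # terms are estimated midway between the shape- and color-change tasks.
    cd_both$task_c <- ifelse(cd_both$task == "shape", -0.5, 0.5)

    combined <- lmer(err ~ targets * distractors * task_c +
                       (targets * distractors | subject),
                     data = cd_both)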

Change detection – color change – adults

The most standard version of array change detection is that of color changes (Cowan, Naveh-Benjamin, Kilb, & Saults, 2006b; Luck & Vogel, 1997; Simmering, 2012). When testing only this version of our tasks, we found a lack of support for load theory generally similar to that observed when testing both versions in a single statistical model (see Footnote 1). In a multilevel model predicting percent errors with number of targets, number of distractors, and their interaction, the effect of target was significant (b=.044, F(1,77.1)=62.51, p<.001). The effects of distractors and of the target-distractor interaction did not significantly predict errors (both ps>.1). Similarly, when predicting log RT, only target was reliable (b=.03, F(1,117.9)=11.88, p<.001; other effects ps>.3).

Enumeration filtering – adults

The two versions of the change-detection paradigm failed to show the interaction between target number and distractor number that would be predicted by load theory. We next tested the predictions of this theory in the enumeration filtering task (see Fig. S2 in the Supplemental Material for data per combination of factors). In multilevel models using the same predictors as above, the results of this task mirrored those of the change-detection tasks. Only target number reliably predicted the absolute error of responses (b=.137, F(1, 168.7)=1296.4, p<.001), with all other ps>.1. A contrasting conclusion is reached when predicting log RT using target number, distractor number, and their interaction, while controlling for participant-level intercepts and slopes. In this case all three fixed effects are reliable (target number: b=.108, F(1, 62.5)=118.3, p<.001; distractor number: b=.019, F(1, 280.3)=6.2, p=.013; interaction: b= -.003, F(1,475.0)=7.2, p=.007). This indicates that performance decreases, in the form of an RT increase, with increasing distractor number and with increasing target number. However, the negative estimate for the interaction term indicates that the distractor effect is diminished at high numbers of targets; this is precisely the pattern predicted by load theory. The results are qualitatively similar when the multilevel model is run with raw RTs, although the distractor main effect is decreased somewhat.
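
As a purely illustrative reading of these estimates, taking the reported fixed-effect point estimates at face value and ignoring their uncertainty, the implied distractor slope on log RT shrinks toward zero as target number grows (the function below and its name are ours, not part of the original analysis):

    # Implied distractor effect on log RT at a given target number, using the
    # reported estimates b_distractor = .019 and b_interaction = -.003.
    distractor_slope <- function(targets) 0.019 - 0.003 * targets
    distractor_slope(c(2, 6, 10))
    # 0.013, 0.001, -0.011: the distractor cost is largest with few targets and
    # approaches zero near six targets, consistent with the load-theory pattern.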

Experiment 1 – Discussion

In two attention-demanding memory tasks and one novel attention task, adults’ performance was only partially congruent with the core predictions of load theory. Participants’ ability to accurately complete the tasks was not reliably affected by the number of distractors in any of the three paradigms, nor did the impact of distractors interact with the number of targets. Yet, evidence for the interaction between distractor effect and target load predicted by load theory was evident in the RTs of the enumeration task. One possible reason for the somewhat equivocal results from Experiment 1 was that the target and distractor loads used in these tasks may not have been extreme enough to produce perceptual load effects in adults’ accuracy (Eltiti, Wallace, & Fox, 2005; see the question regarding the appropriateness of participant population in the Introduction). That is, in adults with high ability levels, the changes in performance due to increases in the number of distractors or targets are relatively small, and under these circumstances an interaction between the target effect and the distractor effect may be difficult to detect. In particular, distractor-dependent changes in performance, which were not uniformly observed, are a necessary precursor for the predicted interaction.

Experiment 2

The core effect of distractor interference should be more detectable in individuals who are more substantively affected by capacity and selection loads. In Experiment 2 we thus examined performance in a cohort of children.

Method

All procedures were approved by the research ethics board at the University of Wisconsin-Madison.

Participants

Children (n=28; mean age =7.92 years, SD=0.922, range = 7.1–9.5; gender not recorded) were recruited from the Madison community using a UW-Madison family-volunteer database. Demographics were not collected, but the community from which we sampled is primarily middle to upper-middle class, non-Hispanic, White, and monolingual English-speaking. The children participating in Experiment 2 chose from a selection of small toys or books as compensation.

Apparatus, procedure, and analysis

The remaining details of the method follow Experiment 1 with a few modifications to be appropriate for children. Due to the lack of distractor effects in the change-detection task in Experiment 1, in Experiment 2 we sought to simply detect and quantify the presence of distractor effects in children’s change-detection performance. That is, we wanted to test our assumption that children were indeed more susceptible to distractor effects than adults (i.e., test the broad question regarding the appropriateness of load theory as applied to different populations). However, due to practical time constraints, we omitted target manipulations from change detection in children. Only color changes were tested (targets vs. distractors indicated by their shape), with the target number held constant (4) and distractor number varied (0, 2, or 5), with 60 trials total.

The same procedure was used as in Experiment 1 for enumeration, but with fewer trials (126 total) and truncated set sizes (one to seven targets; zero, three, or six distractors; six trials of each combination).

Tasks were completed in one session, in a counterbalanced order with a break in between, which took about 45 min. Analyses proceeded identically to Experiment 1 insofar as the tasks overlapped between the two experiments.

Results

Change detection – color change – children

In a multilevel model predicting error percentage using distractor number, while controlling for participant-level intercepts and slopes, the effect of distractor number was significant (b=.018, F(1,26.0)=9.4, p=.005). This provided evidence that children were susceptible to decreases in accuracy due to increased numbers of distractors, unlike adults (see Fig. S1 in the Supplemental Material for data per combination of factors). This was in contrast, however, to the non-significant changes in RT (b=.012, F(1,55)=2.1, p=.149).

Enumeration filtering – children

We fit a multilevel model predicting enumeration absolute error using distractor number, target number, and the interaction between them, while controlling for participant-level intercepts and slopes. In this model all three fixed effects were reliable (target number: b=.197, F(1, 181.6)=60.1, p<.001; distractor number: b=.106, F(1, 256.0)=12.7, p<.001; interaction: b = -.013, F(1, 256.0)=4.25, p=.040). As in change detection, errors in the enumeration task were increased by the number of distractors. This distractor effect was in turn moderated by the number of targets (see Fig. S2 in the Supplemental Material for data per combination of factors).

The same effect was evident when predicting log-transformed RT using target number, distractor number, their interaction, and controlling for absolute error and participant-level intercepts and slopes. The effects of target number (b=.145, F(1, 82.6)=77.9, p<.001), distractor number (b=.082, F(1, 257.9)=27.6, p<.001), and their interaction (b = -.013, F(1, 242.0)=17.2, p<.001) were each significant.

Experiment 2 – Discussion

In both change detection and enumeration, children were sensitive to the presence of distractors where adults had not been. Additionally, the predictions of load theory were borne out in children’s enumeration errors as well as RTs. Each of these results indicates stronger support for perceptual load theory in children than adults.

Additional analyses across experiments

Each of the previous analyses was intended to address the questions related to the presence of load effects in different tasks and different populations. These analyses treated each task and population as independent. However, direct comparisons of the populations and tasks may also be informative. Comparing children’s and adults’ behaviors in a single statistical model provides direct estimates of age-independent (main effects) and age-dependent (interactions) task effects. Further, using the random-effects structures estimated with these statistical models, we can test whether participants’ behaviors are related across tasks. These two analyses allow us to answer questions regarding appropriate participant populations as well as individual differences in load effects.

Child/adult

Children had more errors overall than adults in both the enumeration task (mean difference on matched targets .227; see Fig. S2, Supplemental Material) and the change-detection task (mean difference on matched targets 12.2%; see Fig. S1, Supplemental Material). We contrasted children and adults on the degree to which their data matched the predictions of perceptual load theory. Previous work has shown that throughout child development the effect of perceptual load decreases, and we would therefore expect that children would demonstrate a more pronounced decrease in distractor effect associated with increases in relevant target stimuli (i.e., children’s data will conform to the predictions of perceptual load theory in a more pronounced way than adults’ data).

Child-adult comparisons of change detection

In the change-detection task requiring participants to detect a change in the color of fish cartoons amidst colored circles, we tested adults and children in the same linear mixed-effects model (see Fig. 2). Main effects were significant for age (b= -.104, F(1,86.6)=10.8, p=.002) and distractor number (b=.017, F(1, 336.7)=12.8, p<.001), but not their interaction (p=.164), when controlling for target number and individual variance. Because target number did not vary in children in this task, we could not include interactions with target number. This model indicates that children globally perform worse on this task, and both children and adults perform worse as the number of distractors increases. Because there is no interaction between age and distractor effect, this model does not support the idea that attentional selection matures from middle childhood into adulthood. However, this finding is difficult to interpret. Baseline differences in performance could be due to confounding factors (e.g., increased attentional lapses in children, improved familiarity with similar stimuli by adulthood). Without including multiple levels of targets and distractors, and statistically modeling and partialling out their contributions to performance, it is impossible in this analysis to disentangle baseline performance on the task (i.e., an intercept) from a target capacity effect (i.e., a slope), and thus the developmental differences in this task are difficult to interpret in terms of their theoretical import. The statistical disambiguation allowed by modeling both slopes and intercepts is demonstrated in the following task.

Fig. 2

Fit values of change detection errors. Linear mixed-effects model fits relating the presence of errors in response to varying stimuli. Shaded bands indicate 95% confidence intervals (CIs). Target number did not vary in the children’s task. Distractor range was chosen to demonstrate full model fits; certain combinations of distractors and age were not specifically tested. See Fig. S1 in the Supplemental Material for means and CIs of raw data

Child-adult comparisons of enumeration

As expected, adults made fewer overall errors than children in both change-detection and enumeration tasks (which could be considered an intercept, or baseline difference). However, this effect is statistically overshadowed by the predictive power of target and distractor load. Children's errors increase faster than adults' with the addition of task-relevant targets as well as the addition of task-irrelevant distractors. In a multilevel linear regression model predicting enumeration error while accounting for individual variance, significant effects were present for target number (b=.197, F(1, 682.7), p<.001), distractor number (b=.106, F(1,774.9)=14.0, p<.001), target-distractor interaction (i.e., the prediction of load theory; b= -.013, F(1,731.1)=4.5, p=.034), target-age interaction (i.e., children have larger target effects than adults; b= -.060, F(1,524.1)=4.9, p=.026), distractor-age interaction (i.e., children have larger distractor effects than adults; b= -.089, F(1,753.4)=8.1, p=.005), and the three-way interaction between targets, distractors, and age (b=.013, F(1,731.1)=4.3, p=.039). In fact, in this model, the main effect of age was the only non-significant predictor (p>.2), indicating that baseline differences unrelated to distractor or target load (e.g., small-set-size errors due to children's lower vigilance) were not predictive of error magnitude. See Fig. 3 for predicted values for different age, distractor, and target levels.

Fig. 3

Fit values of estimation error. Linear mixed-effects model fits relating the presence of errors in response to varying stimuli. Shaded bands indicate 95% confidence intervals (CIs). Target and distractor ranges were chosen to demonstrate full model fits; certain combinations of targets, distractors, and age were not specifically tested. See Fig. S2 in the Supplemental Material for means and CIs of raw data

The three-way interaction term is evidence for the presence of an interaction between capacity and selection dimensions of attention in children but not in adults. In this task children, but not adults, exhibit patterns of results that would be predicted by perceptual load theory (Lavie et al., 2004), wherein the inclusion of distractors decreases performance when target load is low but not when target load is high.

RTs in the enumeration task followed the same pattern as the estimation error. In a multilevel model predicting log enumeration RT using target number, distractor number, age, all interactions, and controlling for individual-level variance, only the interaction between age and target number was not reliable (b= -.036, F(1,183.6)=3.51, p=.063). The effects of target number (b=.145, F(1,292.9)=83.7, p<.001), distractor number (b=.082, F(1,770.5)=24.7, p<.001), age (b=-.29, F(1,133.4)=8.2, p=.005), target-distractor interaction (b= -.013, F(1,728.2)=15.2, p<.001), distractor-age interaction (b= -.063, F(1,733.0)=12.2, p<.001), and the three-way interaction between targets, distractors, and age (b=.010, F(1,728.2)=7.9, p=.005) were each significant. RT results are also not qualitatively different when controlling for errors (i.e., assuming that the degree of error independently influences RT).

Cross-task correlations

We found that each of the tasks was sensitive to performance differences due to age and certain aspects of stimulus number. However, the adults’ data in particular did not confirm the predictions of load theory. To explore this further, we next tested cross-task individual differences in distractor effects (in both age groups) and target effects (in adults). A high correlation between the two tasks in selection effects (or in capacity effects) would provide compelling evidence for a controlled-attention basis for change-detection task performance; a low correlation would suggest that differences in task demands cause performance scores to rely on different central processes.

We used the random-effects slopes calculated in the preceding multilevel models to test the product-moment correlations between the two tasks' measurements of target capacity and distractor filtering. None of these correlations were significant. Distractor effect correlations were negligible for both children (r(26)=-.15, p>.4, 95% CI=[-.50, .24]) and adults (r(32)=.04, p>.8, 95% CI=[-.345, .413]), while target correlations for adults were larger but still non-significant given the sample size (r(26)=.30, p>.1, 95% CI=[-.08, .61]). This analysis is much less powerful than the mixed models reported above and a larger sample size would be necessary to claim support for a complete lack of correlation between measures. However, it is clear that the two tasks are not measuring identical attentional abilities, as we would expect a large amount of shared variance if they were (e.g., R2 > .5). Instead, even the largest value in any of the confidence intervals (i.e., .61) would be associated with an R2 of .37. If the true R2 were indeed 0.5, we would have statistical power over .95 to detect this effect, suggesting that we have sufficient power to convincingly reject the hypothesis that the two tasks’ target and/or distractor dimensions are isomorphic. Further work would be needed to identify the shared dimensionalities of the two tasks.
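
A sketch of this correlation analysis, assuming fitted models m_enum and m_cd of the form shown earlier, with a common grouping factor named subject and a by-participant slope for the distractor predictor; names are ours, not the authors'.

    # Conditional modes (BLUPs) of each participant's distractor slope in each
    # task; rows are assumed matched by participant (in practice, merge by
    # participant ID before correlating).
    slope_enum <- ranef(m_enum)$subject[, "distractors"]
    slope_cd   <- ranef(m_cd)$subject[, "distractors"]

    cor.test(slope_enum, slope_cd)   # product-moment correlation with a 95% CI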

General discussion

The overarching predictions of load theory (Lavie et al., 2004; Lavie & Tsal, 1994) were tested in experiments using working memory and enumeration paradigms with children and with adults. The use of multiple tasks within participants, as well as multiple populations, allowed for an extension of our understanding of the conditions under which target and distracting information interacts in attention. Load theory would predict that, while increased distracting information does worsen performance, there is an interaction with target number such that an increased number of targets diminishes the effect of distractors. Adults’ accuracy on change detection and enumeration paradigms was negatively affected by increasing numbers of targets, but there was no reliable change in performance due to distractor number. A lack of distractor effect is unlikely to be due to (as load theory would predict) adults’ attention being so utilized by targets that distractors had no effect. Quite to the contrary, it is much more plausible that the distractors utilized were simply not distracting enough to effectively decrease adults’ accuracy. This interpretation is consistent with adults’ patterns of enumeration RTs, which were systematically affected by both distractor number and target number, due to the greater sensitivity of RTs as a measure.

In contrast to adults, children’s response times and accuracies were affected by both distractor number and target number, and their performance followed the predictions of load theory directly. This led to there being significant age-related differences in the effects of stimulus number on performance. In fact, in the enumeration paradigm, the only non-significant predictor of accuracy was age. This is despite the fact that adults clearly performed better than children. Target number and distractor number, as well as the various interaction terms in the model, were sufficient to suppress the statistical effect of age, indicating that our paradigm accounted for the meaningful age-related variation in selective attention.

Adults uniformly performed better than children. Indeed, this is not surprising; high-functioning young adults are the typical population for psychological research, and as such they act as the baseline group against which age-related or psychopathology-related differences are assessed. In our results, the differences between children and adults were fully explained by the manipulation of targets and distractors in an attention task. It is all too easy, when making a comparison between groups (e.g., ages), to reify group-level differences. For instance, in this study, it would be simple to dismiss the patterns of performance that distinguish adults and children as being qualitatively distinct. However, by implementing statistical models that effectively suppress these age-related differences, we can identify processes as candidates for loci of developmental change. Here, in the enumeration task, we have seen that adults and children differ not just in the degree to which their performance is hindered by increasing target or distractor information, but the ages also differ in the degree to which their performance demonstrates the perceptual load effect.

In fact, as with previous developmental comparisons of the perceptual load effect (Huang-Pollock et al., 2002, Study 1), adults differed from children in that the older group’s RTs were influenced by distractors, while their accuracies were not. This RT difference was not observed in previous work using enumeration with distractors (based on visual inspection of the results of Trick & Pylyshyn, 1993; see especially Experiment 4). However, in stark contrast to Huang-Pollock et al. (2002; Study 1), in our paradigms children’s and adults’ accuracy diverged at high target or distractor loads, rather than starting differently on easy trials and converging on difficult trials. This indicates that age-related variation in performance was not systematically due to an unmeasured variable (e.g., effort or vigilance).

We have shown that, when comparing adult attention to that of children aged 7 and 8 years, load theory explains patterns of intra-individual behavior. That is, we have shown that children are more negatively affected by targets and by distractors than adults. In addition, children clearly demonstrated a larger perceptual load effect than adults. Theoretically important differences were most directly evident in the significant interaction between age and perceptual load effect.

The current extension of load theory to two novel tasks addresses some criticisms of the theory. Far from load effects being non-robust and bound to a specific experimental paradigm, we have shown that children demonstrate load-dependent decreases in distractor interference in enumeration. In addition, there is some indication that the selective attention basis of working memory tasks may also make performance on these tasks susceptible to load effects, but the evidence here is inconclusive. In both of these cases it is clear that children, who are generally more susceptible to distractor interference, demonstrate more robust distractor-related performance decrements than adults. These distractor effects are a necessary component of load theory, and tasks for high-functioning adults may have difficulty demonstrating load effects if they do not include sufficiently effective distractors.

Despite the presence of some explanatory power of the perceptual load effect in our novel experimental paradigms, there was a surprising lack of relations between performances on our two tasks. Many individual-differences studies have found evidence for individual-level covariation between working memory and selective attention (Machizawa & Driver, 2011; Miyake et al., 2001; Unsworth & Spillers, 2010). This has been interpreted as evidence for a common underlying processing ability. Indeed, theories of working memory emphasize its attention-based nature (Cowan, 1995; Kane et al., 2001; Postle, 2015; Shipstead et al., 2014). In contrast to these theoretical predictions that components of performance (i.e., due to target loads or distractor loads) should be shared across the tasks in our experiments, we found no evidence for a common processing ability. Although the lack of cross-task correlation mirrors the lack of intra-individual reliability found by Marciano and Yeshurun (2017), the null result is challenging to interpret. The patterns of performance in the enumeration task were likely reliable enough to detect individual differences, as each model coefficient had a median split-half correlation of over .4 (see Footnote 2). However, the effects of targets, distractors, and their interactions are clearly not equivalent within individuals and across tasks.

One possible reason for a lack of participant-level variation across tasks would be a dissociation between distractor effects. Although theories of working memory emphasize links between attention to and maintenance of information, it is possible that sources of load differ across timescales of processing (i.e., memory load as opposed to perceptual load; Lavie et al., 2004). In this case the variations in stimulus number across the two tasks presented here may have loaded on distinct processes. While this was not our expectation, given the attention-based nature of selection in both tasks, the possibility cannot be ruled out given our data.

As noted in the Introduction, our tasks were just two of a theoretically infinite number of tasks that could be designed to test the predictions of load theory. Our argument is thus not that these “answer” any particular question or confirm or falsify load theory. Instead we suggest that, by utilizing a broader set of tasks and outcome measures, it will allow for refinement of load theory by identifying both places where the theory is broadly consistent with human behavior as well as potential areas where behavior differs from the predictions of load theory. For example, while we tested one dimension of participant-level variability (age-groups), there is a host of other potential individual difference measures that might be predictive of behavior that does or does not correspond to load theory (e.g., measures of ADHD or trait anxiety; Murphy et al., 2016). Future work may benefit from exploring the relations in load effects across tasks in the context of individual difference factors that might predict attentionally linked attributes. Furthermore, although we tested both children and adults, all participants were drawn from largely White and middle-class populations with access to the University, thereby potentially truncating the variance in our inter-individual measures.

A second broad vein of future work could focus on factors inherent in the task. Indeed, just within our basic tasks, there is a potentially rich set of alternative versions (e.g., switching targets and distractors; increasing the distracting nature of distractors; etc.) that could speak to main ideas in load theory. Finally, another interesting future direction is to examine how predictions of load theory interact with participant learning. For example, in our tasks, low error rates on easy trials, as well as participants’ verbal indications of understanding, showed that participants were able to complete the tasks without feedback. If feedback had instead been presented, it is possible that participants may have learned and further improved their performance over the course of a given task. In fact, learning of stimulus features and task structures (e.g., color sets, timing) may have influenced participants’ performance even without feedback. Questions of learning and feedback influencing load effects within attention and memory are beyond the scope of the current work, but individual differences and developmental differences imply the possibility of learning-related changes in selective attention as well.

Finally, we emphasize that our intent with experimental design and analyses was to speak to load theory specifically. However, there are other theories of attention (i.e., beyond load theory) that may make (in some cases alternative) predictions regarding expected behavior given our task manipulations (e.g., Biggs & Gibson, 2010; Bundesen et al., 2005; Torralbo et al., 2010). For instance, some theories of attention posit a beneficial role of attention to distractors (e.g., Makovski, 2019), which could lead to the opposite prediction to that of the current work. We believe that our reported results should be sufficiently transparent that they may also be used to inform other theories as readers may see fit.

Conclusions

In experiments with adults and with children, we demonstrated the presence and the limitations of perceptual load theory. While developmental change was explained by this theory, it was apparent that adults were not challenged enough by our novel tasks for their accuracy to be systematically affected by distracting information. In contrast, children’s performance (as measured by both RT and accuracy) was worse in the presence of many distractors when compared to few distractors, which then allowed for the moderation of this effect by target number to be evident. Target selection (i.e., distractor suppression) and accuracy each improve from childhood to adulthood, and these improvements attenuate the interactions predicted by load theory.