Concurrent guidance of attention by multiple working memory items: Behavioral and computational evidence

Zhou, Cherie; Lorist, Monicque M.; Mathôt, Sebastiaan

doi:10.3758/s13414-020-02048-5

Open access
Published: 11 May 2020

Volume 82, pages 2950–2962, (2020)

Attention, Perception, & Psychophysics


Abstract

During visual search, task-relevant representations in visual working memory (VWM), known as attentional templates, are assumed to guide attention. A current debate concerns whether only one (Single-Item-Template hypothesis; SIT) or multiple (Multiple-Item-Template hypothesis; MIT) items can serve as attentional templates simultaneously. The current study was designed to test these two hypotheses. Participants memorized two colors, prior to a visual-search task in which the target and the distractor could match or not match the colors held in VWM. Robust attentional guidance was observed when one of the memory colors was presented as the target (reduced response times (RTs) on target-match trials) or the distractor (increased RTs on distractor-match trials). We constructed two drift-diffusion models that implemented the MIT and SIT hypotheses, which are similar in their predictions about overall RTs, but differ in their predictions about RTs on individual trials. Critically, simulated RT distributions and error rates revealed a better match of the MIT hypothesis to the observed data than the SIT hypothesis. Taken together, our findings provide behavioral and computational evidence for the concurrent guidance of attention by multiple items in VWM.

Introduction

Internal representations of task-relevant information, or attentional templates, stored in visual working memory (VWM) guide attention in visual search (Bundesen, 1990; Bundesen et al., 2005). For example, when you are looking for a chocolate cake, all dark items in a bakery will be more likely to draw your attention. The biased-competition framework (Desimone, 1998) states that VWM leads to pre-activation of memorized features in visual cortex. In this example, when you keep the color of a chocolate cake in VWM, neurons in color-selective areas that represent this color become pre-activated. Later, when the color is actually perceived, this pre-activation leads to an enhanced neural response, which at the behavioral level results in attention being drawn towards chocolate-cake-like objects. In other words, VWM contents guide attention towards memory-matching items in a top-down manner to optimize visual search (Chelazzi, Miller, Duncan, & Desimone, 1993; Chelazzi, Duncan, Miller, & Desimone, 1998).

Although multiple representations can be maintained in VWM simultaneously, there is ongoing debate about the number of VWM items that can simultaneously serve as attentional templates. The Single-Item-Template hypothesis (SIT; Houtkamp & Roelfsema, 2006; Olivers, Peters, Houtkamp, & Roelfsema, 2011) proposes a functional division within VWM: While one item actively interacts with visual processing to guide attentional selection towards matching items, other items are shielded from visual sensory input, and thus cannot guide attention.

Studies demonstrating a switch cost between templates are often interpreted as evidence for the SIT model. In a study by Dombrowe, Donk, and Olivers (2011), participants made a sequence of two eye movements towards two spatially separated target items that were indicated by arrows. In the switch condition, the two targets had different colors, and thus required a switch between two templates; in the no-switch condition, both targets had the same color, and thus required only one attentional template. Crucially, eye movements that were correctly aimed at the second target were delayed by about 250–300 ms in the switch condition compared to the no-switch condition. This cost associated with switching between templates is in line with the SIT hypothesis, suggesting that only one template can be active at one time.

In contrast to the SIT hypothesis, the Multiple-Item-Template (MIT) hypothesis suggests that multiple VWM items can guide attention simultaneously (Beck et al., 2012), although holding multiple items in VWM would reduce the memory quality of each item, thus reducing memory-driven guidance (Bays & Husain, 2008; Kristjánsson & Kristjánsson, 2018). As Kristjánsson et al. (2018) point out, even if multiple VWM items can guide attention simultaneously, this does not mean that they always do; specifically, they propose that multiple VWM items guide attention at the same time only when this is needed for the task. The MIT hypothesis builds on research suggesting that there is no unitary spotlight of attention, but rather that attention can be divided (Eimer & Grubert, 2014) – in this case, across multiple memory-matching items.

Recent work by Beck and Hollingworth (2017) supported the MIT hypothesis. In their experiment (a saccadic sequential search task), participants first saw a cue that consisted of two colors (e.g., red and blue), followed by two pairs of colored objects, presented one pair at a time. The first pair always contained one non-matching distractor (e.g., yellow) and one object that matched one of the cued colors (e.g., red); participants fixated this cue-matching object. In the second pair, the cue-matching object from the first pair was presented either with a new non-matching distractor (e.g., green) or with an object that matched the remaining cued color (blue). In the latter case, participants were free to select either object. Critically, when participants were free to select either the first- or the second-cued color in the second pair, the selection probability of the first cued color was substantially reduced: They were about as likely to first select red and then blue, as they were to select red twice. In other words, even though participants presumably had an active search template for the first-cued color, the second-cued color was able to compete with it. This competition between the two cue-matching objects suggests that both templates were maintained in an active state in VWM.

However, when looking at behavioral evidence comparing the SIT and MIT hypotheses (e.g., Hollingworth & Beck, 2016; van Moorselaar et al., 2014), it is difficult to distinguish between the two hypotheses by only observing average reaction times (RTs) across trials. A more powerful way to distinguish the underlying cognitive processes is by analyzing RT distributions, an approach that has been used successfully in previous studies. For example, Chetverikov et al. (2016, 2017) looked at RT distributions to test how different properties of previously observed distractor distributions (e.g., shape) influence search times. Furthermore, Sung (2008) analyzed RT distributions for displays of different set sizes to distinguish parallel from serial mechanisms in visual selection. Following this approach, in the current study, we compared not only the average RTs but also the RT distributions of trials in different conditions under the SIT and the MIT hypothesis. Critically, we simulated individual trials based on the predictions of the two hypotheses by means of a drift-diffusion model (Ratcliff & McKoon, 2008) and compared the simulated data to the obtained data. We implemented a visual-search task based on the additional-singleton paradigm (Theeuwes, 1992). Participants first kept two colors in working memory, after which they searched for a colored target shape among a colored distractor shape and, in one experiment (Experiment 1), a gray distractor shape. The color of the target and the (colored) distractor was manipulated to match or not match the memorized colors.

Overall, both the SIT and MIT hypotheses predict faster RTs on target-match trials (i.e., only the target color matches one of the memory colors), and slower RTs on distractor-match trials (i.e., only the distractor color matches one of the memory colors). However, the SIT and MIT hypotheses make different predictions about what happens on individual trials. Specifically, when the target matches a VWM color, the MIT hypothesis predicts that attention is always guided toward the target; in contrast, the SIT hypothesis predicts that attention is only guided toward the target on 50% of trials, because there is only a 50% chance that the target color serves as an attentional template.

Furthermore, we manipulated the congruency between the target and the distractor to investigate whether both memory colors guide attention. The orientation of a line segment inside the target was either congruent or incongruent with that of a line segment inside the (colored) distractor. The MIT hypothesis predicts the strongest congruency effect on both-match trials (i.e., both the target and the distractor match the memory colors), because attention is simultaneously guided towards both the target and the distractor. Therefore, when the line-segment orientations of target and distractor are congruent, it is easier to report the orientation even though attention is partly drawn to the distractor, resulting in reduced RTs and error rates. In contrast, in the incongruent condition, there is more cognitive conflict caused by the different orientation of the matching distractor, resulting in increased RTs and error rates. The SIT hypothesis does not predict that attention is guided simultaneously towards the target and the distractor, and therefore does not predict an especially strong congruency effect on both-match trials.

When it comes to the RT distributions of individual trials, the MIT hypothesis predicts that the distributions for both-match and non-match (i.e., neither the target nor the distractor matches the memory colors) trials are the same, or at least very similar: On both-match trials, attention is guided toward both the target and the distractor, and the resulting facilitation and interference should approximately cancel each other out, resulting in an RT distribution that is similar to that of the condition where no color matches the VWM items. In contrast, under the SIT hypothesis, on both-match trials, attention is guided either toward the target, resulting in fast RTs, or toward the distractor, resulting in slow RTs, but never to both at the same time. Thus, the distribution for both-match trials is expected to be wider than that for non-match trials.^{Footnote 1} We built drift-diffusion models of individual trials to simulate the two hypotheses’ predictions about RT distributions, and compared these with the collected data.
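The contrast between the two predictions can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a single-boundary diffusion process, and the drift rates, threshold, noise level, non-decision time, and the function names are all hypothetical values chosen for illustration. It also assumes that, under MIT, facilitation and interference cancel exactly on both-match trials.

```python
import random
import statistics


def simulate_rt(drift, rng, threshold=1.0, noise=1.0, dt=0.005, non_decision=0.3):
    """Time (s) for noisy evidence to reach a single threshold (Euler steps)."""
    evidence, t = 0.0, 0.0
    step_sd = noise * dt ** 0.5
    while evidence < threshold and t < 10.0:  # 10-s cap guards against runaway trials
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return non_decision + t


def both_match_rts(model, n_trials, base=3.0, guidance=1.5, seed=1):
    """Simulate both-match trials under the 'MIT' or 'SIT' assumption."""
    rng = random.Random(seed)
    rts = []
    for _ in range(n_trials):
        if model == "MIT":
            # Both templates are active: facilitation and interference cancel.
            drift = base + guidance - guidance
        else:
            # SIT: only one template is active, chosen at random on each trial.
            drift = base + guidance if rng.random() < 0.5 else base - guidance
        rts.append(simulate_rt(drift, rng))
    return rts


def non_match_rts(n_trials, base=3.0, seed=2):
    """Simulate non-match trials: no memory-driven guidance at all."""
    rng = random.Random(seed)
    return [simulate_rt(base, rng) for _ in range(n_trials)]


if __name__ == "__main__":
    for label, rts in [
        ("non-match", non_match_rts(1000)),
        ("both-match, MIT", both_match_rts("MIT", 1000)),
        ("both-match, SIT", both_match_rts("SIT", 1000)),
    ]:
        print(f"{label}: mean = {statistics.mean(rts):.3f} s, SD = {statistics.stdev(rts):.3f} s")
```

Under these assumptions, the SIT both-match distribution is a mixture of a fast and a slow component and is therefore wider than the non-match distribution, whereas the MIT both-match distribution has the same spread as the non-match distribution by construction.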

To preview the results: The data by and large favor the predictions of the MIT hypothesis over those of the SIT hypothesis.

Experiment 1

Preregistration

Before conducting the experiment, we pre-registered the experimental designs on the Open Science Framework (OSF). A detailed pre-registration of the experiment is available at https://osf.io/sy7n8/. All deviations from the preregistration are mentioned below.

Method

Participants

We conducted a power analysis based on the results of a replication of Hollingworth and Beck (2016) as performed by Frătescu et al. (2019). Here the authors found that the effect size of the Distractor condition was f = 0.65. A power analysis conducted with G*Power (Faul et al., 2007) revealed that in order for this effect to be detected with a power of 95% and an alpha of .05, a sample of only seven participants would be required. Although this study is not identical to ours, this power analysis shows that memory-driven capture effects are strong and can be detected with few participants. However, our aim was to collect highly precise measurements that we could also use for computational modeling. In addition, we were interested in a modulation of the memory-driven capture effect by orientation congruency, and we had no a priori prediction about the strength of this modulatory effect. Therefore, we decided to test at least 30 participants per experiment, which we felt confident would provide sufficient statistical power.

Thirty-five first-year psychology students (aged from 18 to 23 years; 18 female, 17 male) from the University of Groningen participated in exchange for course credits. All participants had normal or corrected-to-normal acuity and color vision. The study was approved by the local ethics review board of the University of Groningen (18123-S). Participants provided written informed consent before the start of the experiment.

Stimuli, design, and procedure

Participants were seated in a dimly lit, sound-attenuated testing booth, behind a computer screen on which the stimuli appeared at a viewing distance of approximately 62 cm. Stimuli were presented on a 27-in. flat-screen monitor at a refresh rate of 60 Hz running OpenSesame (version 3.2; Mathôt et al., 2012). Each trial started with a 500-ms fixation display, followed by a 1,000-ms memory display, consisting of two color disks (2.7° visual angle) placed in the middle of the screen to the left and the right of the fixation dot, with an eccentricity of 5.4° visual angle (Fig. 1). The memory colors were randomly drawn from an HSV (hue-saturation-value) color circle with full value (i.e., brightness) and saturation for each hue (luminance ranged between 49 cd/m² and 90 cd/m²), with the restriction that colors were at least 30° away from each other on the color circle. Participants were instructed to remember the exact colors of the items, and not the color category, to discourage verbalization.
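The color-sampling constraint described above can be sketched as follows. This is an illustrative reconstruction, not the experiment's actual code: the function names are hypothetical, and rejection sampling is an assumed way of enforcing the 30° minimum separation on the color circle.

```python
import colorsys
import random


def circular_distance(h1, h2):
    """Smallest angular distance (degrees) between two hues on the color circle."""
    d = abs(h1 - h2) % 360
    return min(d, 360 - d)


def sample_memory_hues(min_separation=30.0, rng=random):
    """Draw two hues, resampling until they are at least min_separation apart."""
    while True:
        h1, h2 = rng.uniform(0, 360), rng.uniform(0, 360)
        if circular_distance(h1, h2) >= min_separation:
            return h1, h2


def hue_to_rgb(hue_deg):
    """Convert a hue to RGB with full saturation and full value, as in the text."""
    return colorsys.hsv_to_rgb(hue_deg / 360.0, 1.0, 1.0)


if __name__ == "__main__":
    hue_a, hue_b = sample_memory_hues()
    print(hue_a, hue_to_rgb(hue_a))
    print(hue_b, hue_to_rgb(hue_b))
```

Note that the distance must be computed on the circle: hues of 350° and 10° are only 20° apart and would be rejected.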

Following a 200-ms fixation display, the search display was presented and remained visible until a response was given. The search display consisted of three shapes (1.3° visual angle): one diamond-shaped, colored target; one square-shaped, colored distractor; and another square-shaped, gray distractor, all placed around the fixation dot, with an eccentricity of 5.4° visual angle. The colors of the target (diamond) and the colored distractor (square) either matched or did not match the remembered color depending on the Target-Color-Match (Match, Non-Match) and Distractor-Color-Match condition (Match, Non-Match), resulting in four types of trials: Non-Color-Match (i.e., target-color-non-match, distractor-color-non-match), Target-Color-Match (i.e., target-color-match, distractor-color-non-match), Distractor-Color-Match (i.e., target-color-non-match, distractor-color-match), and Both-Color-Match (i.e., target-color-match, distractor-color-match). All shapes in the search display contained a line segment (1.1° visual angle) that was tilted 22.5° clockwise or counterclockwise from a vertical orientation. The line segments in the target and the colored distractor were tilted in the same (Congruent) or a different (Incongruent) direction depending on the Orientation-Congruency condition. The line segment inside the gray distractor was chosen randomly, and was not analyzed.

In our experiment, a color match was always exact; that is, when participants memorized a shade of green, on a Target-Color-Match trial, the visual-search target was always the exact same shade of green. However, this is not necessary for memory-driven capture to occur: both exact and inexact color matches lead to memory-driven capture (e.g., Hollingworth & Beck, 2016; see also our own supplementary analysis in the Open Science Framework).

Participants indicated the orientation of the line segment within the diamond by clicking either the left or the right mouse button as quickly and accurately as possible. Feedback was given for 500 ms immediately following the response: a green dot for a correct response, or a red dot for an incorrect response. Each trial ended with a memory test, in which participants selected the exact color they memorized in the color circle. They did this twice, once for each memorized color. Visual feedback followed, comparing the colors they selected with those that they actually saw. The accuracy of each memory test was recorded as memory precision.

The three factors (Target-Color-Match, Distractor-Color-Match, Orientation-Congruency) were mixed randomly within blocks. Participants completed eight blocks of 32 trials each (256 trials in total), preceded by one practice block of 32 trials that was excluded from analysis.
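The factorial structure can be made explicit with a short sketch. Crossing the three two-level factors gives eight cells. The text states only that the factors were mixed randomly within blocks, so the equal number of trials per cell within each block (four) is an assumption made here for illustration, as are the names used.

```python
import itertools
import random

# Factor names and levels as described in the text.
FACTORS = {
    "target_color_match": ("match", "non-match"),
    "distractor_color_match": ("match", "non-match"),
    "orientation_congruency": ("congruent", "incongruent"),
}


def make_block(trials_per_block=32, rng=random):
    """Return one shuffled block with every design cell equally represented."""
    cells = [
        dict(zip(FACTORS, levels))
        for levels in itertools.product(*FACTORS.values())
    ]
    repeats, remainder = divmod(trials_per_block, len(cells))
    if remainder:
        raise ValueError("trials_per_block must be a multiple of the number of cells")
    block = cells * repeats  # assumed balance: the same number of trials per cell
    rng.shuffle(block)
    return block


if __name__ == "__main__":
    block = make_block()
    print(len(block), "trials per block;", 8 * len(block), "experimental trials in total")
```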

Data processing

Trials with RTs shorter than 200 ms or longer than 2,000 ms were excluded. Next, participants were excluded from analyses if their accuracy on the search task was less than .7. (These criteria were not preregistered. We added them because our preregistered criteria failed to exclude some data points that were clearly unsatisfactory, such as participants who scored at chance level on the search task.) No participants were excluded based on our preregistered criterion of having a mean RT that deviated by more than 2.5 SD from the grand mean. Only RT data of correct trials were analyzed. Thirty participants and 7,478 trials (of 8,960) remained for further analysis.
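The exclusion rules and the trial count can be written down compactly. The sketch below uses hypothetical function names and treats the boundary values (exactly 200 ms, 2,000 ms, or .7) as included, which the text does not specify.

```python
def keep_trial(rt_ms, min_rt=200, max_rt=2000):
    """Keep trials whose RT is neither shorter than min_rt nor longer than max_rt."""
    return min_rt <= rt_ms <= max_rt


def keep_participant(search_accuracy, min_accuracy=0.7):
    """Keep participants whose search accuracy is not below min_accuracy."""
    return search_accuracy >= min_accuracy


# Values taken from the text: 35 participants, 8 blocks of 32 trials each.
N_RECRUITED = 35
TRIALS_PER_PARTICIPANT = 8 * 32

if __name__ == "__main__":
    # 35 * 256 = 8,960, the total number of recorded trials reported above.
    print(N_RECRUITED * TRIALS_PER_PARTICIPANT)
```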

Data analysis

The data were analyzed using the JASP software package (version 0.9; JASP Team, 2018) with the default settings, with Target-Color-Match (Match, Non-Match), Distractor-Color-Match (Match, Non-Match), and Orientation-Congruency (Congruent, Incongruent) as factors. (This deviates slightly from the preregistration, in which we treated Color-Match as a single factor with four levels.) We used an inclusion Bayes Factor (BF) based on matched models (Rouder et al., 2009) to quantify evidence for effects.

Following Lee and Wagenmakers (2013), we considered BFs between 1 and 3 or between .33 and 1 as indicators of “anecdotal” evidence in favor of the alternative (H₁) or the null hypothesis (H₀), respectively; BFs between 3 and 10 or between .1 and .33 as indicators of “moderate” evidence; BFs between 10 and 30 or between .03 and .1 as indicators of “strong” evidence; and BFs between 30 and 100 or between .01 and .03 as indicators of “very strong” evidence for H₁ or H₀.
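These thresholds amount to a simple lookup, sketched below with a hypothetical helper. Two choices are assumptions rather than part of the scheme as stated: values on a boundary are assigned to the stronger category, and BFs beyond 100 (or below .01) are also labeled "very strong", since the text defines no category beyond that range.

```python
def classify_bf(bf10):
    """Map a Bayes factor (H1 relative to H0) to an evidence label and the favored hypothesis."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive")
    favored = "H1" if bf10 >= 1 else "H0"
    # Express evidence for H0 on the same scale by inverting BF10 into BF01.
    strength = bf10 if bf10 >= 1 else 1.0 / bf10
    if strength < 3:
        label = "anecdotal"
    elif strength < 10:
        label = "moderate"
    elif strength < 30:
        label = "strong"
    else:
        label = "very strong"  # assumption: also used for values beyond 100
    return label, favored


if __name__ == "__main__":
    print(classify_bf(7.19))      # a BF10 value
    print(classify_bf(1 / 4.05))  # a BF01 of 4.05, expressed as BF10
```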

Results and discussion

Search reaction times (RTs)

Analyses revealed very strong evidence for the effect of Target-Color-Match (BF₁₀ = 3.30×10²⁴) and Distractor-Color-Match (BF₁₀ = 4.07×10¹⁵), such that RTs were faster when the target matched the memory color, and slower when the distractor matched the memory color (Fig. 2). Moreover, we found moderate evidence for the effect of Orientation-Congruency (BF₁₀ = 7.19), suggesting that RTs were faster on congruent trials than on incongruent trials. No interaction effect between the factors was found (all BF₁₀ < .06). (We also performed a supplementary analysis that included Memory Precision, based on a median split, as an additional factor. This revealed that memory precision of the VWM contents did not affect RTs or interact with any of the other factors. For more information, see the Open Science Framework.)

RT distributions

To test whether only one (i.e., SIT) or both (i.e., MIT) of the color items maintained in working memory served as an attentional template, we analyzed the RT distributions for the Both-Color-Match and Non-Color-Match trials. According to the SIT hypothesis, on Both-Color-Match trials, attention is guided toward the target on some trials, which leads to faster RTs, while on other trials attention is guided toward the distractor, which leads to slower RTs. Therefore, the Both-Color-Match trials should result in a bimodal distribution (i.e., wider than that of the Non-Color-Match trials) according to the SIT hypothesis. In contrast, the MIT hypothesis predicts that on Both-Color-Match trials, both the target and the distractor guide attention, thus resulting in a unimodal distribution (i.e., resembling that of the Non-Color-Match trials).

To test this, an inverse Gaussian distribution was fit to the RTs per condition for each participant. The scale parameter, which reflects the width of the distributions, was analyzed using a Bayesian t-test. We found moderate evidence that the RT distributions for the Both-Color-Match and the Non-Color-Match trials were equally wide (BF₀₁ = 4.05, error % = .002), as predicted by the MIT hypothesis.
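As an illustration of such a fit, the sketch below computes closed-form maximum-likelihood estimates for an unshifted inverse Gaussian with mean μ and parameter λ, for which the variance is μ³/λ. The fitting procedure and parameterization the authors used are not stated in the text, so the function name, the absence of a shift parameter, and the correspondence between λ and the reported "scale parameter" are all assumptions.

```python
def fit_inverse_gaussian(rts):
    """Closed-form maximum-likelihood estimates (mu, lam) for positive RTs."""
    n = len(rts)
    if n < 2 or any(rt <= 0 for rt in rts):
        raise ValueError("Need at least two strictly positive RTs")
    mu = sum(rts) / n
    # MLE of lambda: n / sum(1/x_i - 1/mu); the sum is zero only if all RTs are equal.
    lam = n / sum(1.0 / rt - 1.0 / mu for rt in rts)
    return mu, lam


def inverse_gaussian_sd(mu, lam):
    """Standard deviation implied by the fitted parameters: sqrt(mu^3 / lam)."""
    return (mu ** 3 / lam) ** 0.5


if __name__ == "__main__":
    example_rts = [0.52, 0.61, 0.48, 0.75, 0.66, 0.58]  # made-up RTs in seconds
    mu_hat, lam_hat = fit_inverse_gaussian(example_rts)
    print(mu_hat, lam_hat, inverse_gaussian_sd(mu_hat, lam_hat))
```

For a fixed mean, a larger λ implies a narrower distribution, so comparing this parameter (or the implied standard deviation) across conditions is one way to compare distribution widths.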

Accuracy

Analyses revealed moderate evidence for the effect of Target-Color-Match (BF₁₀ = 3.02) and Distractor-Color-Match (BF₁₀ = 6.58), such that the overall search accuracy was higher when the target matched the memory color, and lower when the distractor matched the memory color (Fig. 3). Furthermore, we found very strong evidence for the effect of Orientation-Congruency on accuracy (BF₁₀ = 4.50×10¹³), showing that search performance was more accurate when the orientation of the line segment in the target was congruent with that in the distractor than when they were incongruent. No evidence for any interaction effect between the factors was found (all BF₁₀ < 2.0). (A supplementary analysis that included Memory Precision as an additional factor revealed that memory precision did not affect accuracy or interact with any of the other factors. For more information, see the Open Science Framework.)

In summary, search performance improved (i.e., became faster and more accurate) when the target matched one of the colors held in VWM, but deteriorated when the distractor matched one of them. Moreover, the RT distributions for both-match and non-match trials were similar, which suggests that both color items maintained in VWM drew attention. These results are consistent with the assumptions of the MIT hypothesis, which we will address in the General discussion.

Contrary to our prediction, however, we did not find that the effect of Orientation-Congruency was especially strong when both the target and the distractor matched, compared to other conditions. We suspected that the presence of the gray (unrelated) distractor might have affected the processing of the target and the colored distractor in visual search. Therefore, in the follow-up experiment, we removed the gray distractor from the search display.

Experiment 2

In Experiment 2, we removed the gray color item (the unrelated item) from the search display. We reasoned that this would increase the strength of the Orientation-Congruency effect, because there were now only two line segments in the display, thus providing a stronger test of our prediction that the effect of Orientation-Congruency should be strongest when both the distractor and the target matched the VWM colors. Furthermore, we wanted to replicate the main results of Experiment 1.