Introduction

Numbers play a critical role as symbols representing magnitude or quantity in daily life. It is therefore important to investigate how the brain processes the magnitude information that numbers convey. One of the most studied phenomena in number processing is the size congruity effect, also commonly called the numerical Stroop effect (e.g., Dadon & Henik, 2017; Henik & Tzelgov, 1982; Pansky & Algom, 1999). In a typical size congruity experiment, participants view pairs of numbers that vary in numerical and physical magnitude, and on each trial they make a magnitude judgment about either the numerical value or the physical size of the pair. Depending on task demands, one dimension is task-relevant on a given trial while the other is irrelevant and should be ignored. Participants typically respond more quickly and more accurately when the numerical value and physical size of the paired numbers are congruent (with the numerically larger number appearing in a physically larger size) than when they are incongruent (with the numerically larger number appearing in a physically smaller size).

This size congruity effect is well established and has commonly been interpreted as showing that the processing of the numerical value and the physical size of numbers interact during magnitude judgments (e.g., Arend & Henik, 2015; Henik & Tzelgov, 1982; Pansky & Algom, 1999; Reike & Schwarz, 2017; Santens & Verguts, 2011; Schwarz & Heinze, 1998). The effect can be explained at least in part by the influential theory of magnitude proposed by Walsh (2003). According to this theory, there is a general substrate in the human brain (e.g., inferior parietal cortex) for representing magnitudes in different dimensions, such as numerical value and physical size. Different magnitudes share common metrics through such a generalized magnitude system, so that both the numerical and the physical dimension of a number stimulus are mapped onto an analog magnitude representation, leading to the classic size congruity effect in a numerical Stroop task. It is important to note, however, that besides sharing a generalized magnitude system for an integrated analog representation, different dimensional magnitudes may also be processed in parallel and have separate representations implemented by dimension-specific processes, with the generalized magnitude system being only a partly shared representational mechanism for different magnitude dimensions (Arend & Henik, 2015; Cappelletti et al., 2009, 2011). Accordingly, the size congruity effect can be considered a behavioral outcome of interactions between numerical and physical magnitudes. These interactions may take place at early processing stages, where a partly shared representation for the two magnitudes is formed, and/or at later processing stages, where the separate representations for numerical and physical magnitudes lead to two separate decision alternatives that may then induce facilitation or interference during overt response preparation and execution.

Recently, attentional capture has been proposed as a contributor to the size congruity effect in a numerical Stroop task (Risko et al., 2013). Based on the evidence that larger objects capture attention in visual search (e.g., Proulx, 2010; Proulx & Egeth, 2008; Proulx & Green, 2011), Risko et al. (2013) assumed that the physically larger number initially captures attention more than the physically smaller number does when they are presented simultaneously, and that such an involuntary shift of spatial attention toward the physically larger number contributes to the size congruity effect by creating an asymmetry in the temporal order in which the numbers are processed. The size-based attentional capture assumption is supported by the findings of Risko et al. (2013) showing that the first saccade was generated more rapidly to the physically larger number than to the physically smaller number when they were presented simultaneously, and that the more rapidly the saccade was generated, the more likely it was to be directed toward the physically larger number. More importantly, Risko et al. (2013) manipulated the temporal order of the number onsets in a numerical comparison task (e.g., indicating which of the two numbers in a pair is numerically larger) and found that the magnitude of the size congruity effect was significantly reduced when the physically smaller number appeared first compared to when both numbers of a pair appeared simultaneously. Because the earlier onset of the physically smaller number counteracts the influence of attentional capture by the physically larger number, the finding was interpreted as evidence that size-based attentional capture contributes to the size congruity effect. The importance of the research by Risko et al. (2013) is that it provides preliminary evidence that the interaction between numerical value and physical size is susceptible to mediation by spatial attention shifts, highlighting a critical role for spatial shifts of attention in generating the size congruity effect. However, one methodological limitation of the study by Risko et al. (2013) is that it did not include a physical task (e.g., indicating which of the two numbers in a pair is physically larger). Hence, the contribution of spatial attention shifts to the size congruity effect in a physical task remains to be established. If involuntary shifting of attention toward the location of the physically larger number really plays a crucial role in producing the size congruity effect, we may expect an attentional contribution in both numerical and physical tasks, because spatial attention shifts between numbers are supposed to take place in both.

Recently, Arend and Henik (2015) questioned the hypothesis of the attentional contribution to the size congruity effect in magnitude comparisons. They argued that if attention really plays a major role in producing the size congruity effect, then attention shifts by task instructions (i.e., choose the larger number vs. choose the smaller number) should modulate the size congruity effect in both numerical and physical tasks. Specifically, it was predicted that if attention is an important factor for the size congruity effect in a given task, the magnitude of the size congruity effect would be greater under instructions of “choose the larger number” than under instructions of “choose the smaller number.” However, it was found that the modulation of the size congruity effect by instructions occurred only in the numerical comparison task but not in the physical comparison task. The finding was interpreted by Arend and Henik as being incompatible with the attentional account of the size congruity effect proposed by Risko et al. (2013). We suggest that the null effect of task instructions on the size congruity effect in the physical comparison task only indicates that top-down control of attention by task instructions may not affect interactions between physical and numerical magnitudes during physical comparisons. It does not necessarily exclude the possibility that bottom-up biases of spatial attention, such as an involuntary shift of attention to the location of the physically larger number, may play a role in generating the size congruity effect in a physical task. Given that spatial shifts of attention between numbers in a numerical Stroop task consist of both top-down and bottom-up forms and that top-down but not bottom-up attention shifts are manipulated by task instructions, it is possible that the manipulation of attention by task instructions might not be most suitable for assessing the contribution of spatial attention shifts to the size congruity effect. Hence, a more elegant assessment of the contribution of spatial attention shifts is essential.

The present study aimed to demonstrate a contribution of spatial attention shifts to the size congruity effect in both numerical and physical tasks. To assess the role of spatial attention shifts in the size congruity effect, we compared the size congruity effects between sequential and simultaneous presentations of number pairs. Unlike Risko et al. (2013), however, we presented the paired numbers sequentially at central fixation, with an interstimulus interval (ISI) of 1,000 ms between them, in the sequential presentation mode. The long ISI would prevent number stimuli presented sequentially at the same location from generating visual masking effects. Note that there would be no opportunity for any spatial shifts of attention between numbers to occur in the sequential presentation condition of the present study, because the central presentation of number stimuli would make participants hold their attention at fixation without shifting it to other locations while performing the task. This enables us to assess the putative contribution of spatial attention shifts by comparing the size congruity effects between sequential and simultaneous presentations of number pairs in a given task. If spatial shifts of attention between numbers really play a role in generating the size congruity effect, then the magnitude of the size congruity effect would be expected to be reduced or eliminated when the two numbers in a pair are presented sequentially, as compared to when they are presented simultaneously.

The present study broadens the existing literature by examining an attentional contribution to the size congruity effect across three experiments using different task paradigms. Participants were sequentially or simultaneously presented with a pair of single-digit Arabic numbers whose numerical and physical magnitudes varied independently. In Experiment 1, the task was to compare the two numbers in terms of their numerical values, irrespective of physical sizes. In Experiment 2, the task was instead to compare the physical sizes rather than the numerical values of the two numbers. In Experiment 3, the task was to decide whether the two numbers in a pair were the same or different in terms of their physical sizes, irrespective of numerical values. If spatial shifts of attention between numbers contribute to the size congruity effect, then the effect should be reduced or eliminated when the two numbers of a pair are presented sequentially rather than simultaneously, in any of the three tasks.

Experiment 1

The aim of this experiment was to replicate the attentional contribution to the size congruity effect during numerical comparisons using an adaptation of Risko et al.’s (2013) paradigm. Participants were shown a pair of single-digit numbers differing in both physical size and numerical value on each trial, and were required to compare the two numbers in terms of their numerical values, irrespective of physical sizes. The pair of numbers could be visually presented either sequentially or simultaneously. Critically, in the sequential presentation mode, the two numbers of the pair were presented sequentially at central fixation, rendering it unlikely for any spatial shifts of attention to occur. We predicted that the magnitude of the size congruity effect would be reduced or even eliminated in the sequential presentation condition where spatial attention shifts would be precluded, as compared with the condition of simultaneous presentation in which spatial attention shifts and their influence on the size congruity effect are supposed to take place.

Method

Participants

A total of 16 students (eight females; 18–25 years of age) from Hangzhou Normal University were recruited to participate in this experiment for monetary compensation. On the basis of previous studies showing the attentional modulation of the size congruity effect in a numerical comparison task (Arend & Henik, 2015; Risko et al., 2013), a power analysis (MorePower; Campbell & Thompson, 2012) indicated that a sample size of 16 participants was needed to detect an effect size of ηp2 = 0.36 with 80% power at an α level of .05. All participants were right-handed and reported having normal or corrected-to-normal vision. Informed consent was obtained from each participant prior to the experiment, which was conducted in accordance with the tenets of the Declaration of Helsinki and local ethics regulations.

Apparatus and stimuli

The experiment was controlled by E-Prime software. Responses were made by pressing keys on a standard keyboard with the participant’s right hand. Stimuli were presented on a 17-in. CRT monitor with a resolution of 1,024 × 768 pixels and a 100-Hz refresh rate. The stimuli were created on the basis of those employed by Dadon and Henik (2017). The stimuli were single-digit Arabic numbers from 1 to 9 with the exclusion of the number 5. We used these stimuli to create six different number pairs (1-2, 3-4, 6-8, 7-9, 2-7, 3-8). The paired numbers were presented in Arial font, with three possible pairs of font sizes used: 40-44, 56-67, 52-76. Thus, the two numbers of a pair presented always differed from each other in both numerical value and physical size. All stimuli were presented in black on a white background at a viewing distance of approximately 50 cm.

Procedure and design

Participants initiated each trial by pressing the space bar. Each trial began with the display of a black central fixation cross for 1,000 ms. Then, a pair of single-digit numbers could be presented either sequentially or simultaneously on a computer screen. Participants were instructed to indicate the numerically larger number of the pair while ignoring the physical sizes of the numbers. In the sequential presentation mode, the first number was presented for 1,000 ms at the center of the screen, followed by an ISI of 1,000 ms, followed by the central presentation of the second number that remained in view until the participant made a speeded response regarding the temporal order (first vs. second) of the onset of the numerically larger number (see Fig. 1). Participants were asked to press the “F” and “J” keys for the “first” and “second” responses, respectively. Response time (RT) was measured in milliseconds from the onset of the second number. In the simultaneous presentation mode, the two numbers were simultaneously presented to the left and right side of the screen (with a center-to-center distance of 13.5 cm) and remained visible until response. Participants were required to respond by pressing the “F” key if the numerically larger number was presented to the left side, and the “J” key if the numerically larger number was presented to the right side.
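The trial structure described above can be written down as a small data model. The following is only a sketch under stated assumptions, not the E-Prime script used in the study: the names `Event`, `trial_events`, `correct_key`, and `rt_clock_start_ms` are hypothetical, and the assumption that RT in the simultaneous mode is also timed from the onset of the number display is ours (the text states the RT origin only for the sequential mode).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    label: str
    duration_ms: Optional[int]  # None = stays on screen until the response


def trial_events(mode: str) -> List[Event]:
    """Ordered display events for one trial in the given presentation mode."""
    fixation = Event("fixation cross", 1000)
    if mode == "sequential":
        return [
            fixation,
            Event("first number (center)", 1000),
            Event("ISI", 1000),
            Event("second number (center)", None),  # RT is timed from this onset
        ]
    if mode == "simultaneous":
        # Assumption: RT is timed from the onset of the pair.
        return [fixation, Event("both numbers (left and right)", None)]
    raise ValueError(f"unknown presentation mode: {mode}")


def correct_key(larger_is_first_or_left: bool) -> str:
    """'F' = first (sequential) or left (simultaneous); 'J' = second or right."""
    return "F" if larger_is_first_or_left else "J"


def rt_clock_start_ms(mode: str) -> int:
    """Time from trial start to the onset of the response display."""
    return sum(e.duration_ms for e in trial_events(mode) if e.duration_ms is not None)


print(rt_clock_start_ms("sequential"))    # 3000
print(rt_clock_start_ms("simultaneous"))  # 1000
```

Writing the two modes side by side makes the key design difference explicit: in the sequential mode both digits occupy the same central location, whereas the simultaneous mode is the only one with two spatially separated items.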

Fig. 1. Schematic illustration of the trial sequence and example stimuli for Experiments 1–3. The two numbers of a pair could be presented sequentially at the center of the screen (panel A), or simultaneously to the left and right side of the screen (panel B).

The first number was numerically larger than the second number on half the trials in the sequential presentation mode, and the left number was numerically larger than the right number on half of the trials in the simultaneous presentation mode. The two presentation modes were blocked and the order of blocks was counterbalanced across participants. In each block, there were two types of trials, each defined by the congruity between physical size and numerical value of the paired numbers. On congruent trials, the numerically larger number in the pair was physically larger. On incongruent trials, the numerically larger number was physically smaller. The two types of trials occurred equally often and varied randomly within each block. The participants were encouraged to perform the numerical comparison task as quickly and accurately as possible. Each participant completed two practice blocks of 24 trials each, followed by two experimental blocks of 144 trials each. The participants could take a short break between blocks.
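One way to realize the balanced design described above is to cross the six number pairs, the three font-size pairs, congruity, and the position of the numerically larger number. This is a minimal sketch, not the authors' procedure: the function name `build_block` is hypothetical, and the assumption that each of the 72 unique combinations is repeated twice to reach 144 trials is ours, since the text does not say how the 144 trials were composed.

```python
import itertools
import random

NUMBER_PAIRS = [(1, 2), (3, 4), (6, 8), (7, 9), (2, 7), (3, 8)]
FONT_PAIRS = [(40, 44), (56, 67), (52, 76)]  # (smaller, larger) font size


def build_block(seed=0):
    """Return a shuffled list of 144 trials balanced for congruity and order."""
    trials = []
    for (low, high), (small_font, large_font), congruent, larger_first in itertools.product(
            NUMBER_PAIRS, FONT_PAIRS, (True, False), (True, False)):
        # Congruent: the numerically larger digit is shown in the larger font.
        high_font, low_font = (large_font, small_font) if congruent else (small_font, large_font)
        larger = {"digit": high, "font": high_font}
        smaller = {"digit": low, "font": low_font}
        first, second = (larger, smaller) if larger_first else (smaller, larger)
        trials.append({
            "first_or_left": first,      # first (sequential) or left (simultaneous)
            "second_or_right": second,
            "congruent": congruent,
            "larger_first": larger_first,
        })
    trials = trials * 2  # assumed: 72 unique combinations x 2 repetitions = 144
    random.Random(seed).shuffle(trials)
    return trials


block = build_block()
print(len(block), sum(t["congruent"] for t in block))  # 144 72
```

The same list can serve both presentation modes, with "first" read as temporal order in the sequential block and as the left position in the simultaneous block.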

Results and discussion

A repeated-measures analysis of variance (ANOVA) with presentation mode (sequential vs. simultaneous) and size congruity (congruent vs. incongruent) as within-subjects factors was conducted separately on RTs and accuracy rates (percentages of correct responses). Trials with incorrect responses were not included in the analysis of RTs. To account for outliers, median RTs for each subject in each condition were used in the analysis of RT data in all the experiments reported here. An analysis of the RTs showed that the main effect of presentation mode did not approach significance, F < 1. The main effect of size congruity was significant, F(1, 15) = 5.559, p = .032, ηp2 = .270, with faster performance on congruent trials than on incongruent trials. There was also a significant interaction effect, F(1, 15) = 5.506, p = .033, ηp2 = .269. Analysis of simple effects underlying this interaction revealed that while responses to simultaneously presented pairs of numbers were significantly faster on congruent trials (M = 487 ms) than on incongruent trials (M = 509 ms), F(1, 15) = 13.807, p = .002, ηp2 = .479, RTs to sequentially presented pairs of numbers did not significantly differ between these two types of trials (497 ms vs. 501 ms), F < 1. This pattern of RT results is presented in Fig. 2A.
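The RT preprocessing described above (correct trials only, one median per participant and condition) and the resulting size congruity effect can be sketched as follows. The trial records are invented purely for illustration, and the names `median_correct_rt` and `congruity_effect` are hypothetical; this is not the authors' analysis code.

```python
from statistics import median

# Hypothetical trials for ONE participant: (mode, congruent, correct, rt in ms).
raw = [
    ("simultaneous", True, True, 470), ("simultaneous", True, True, 490),
    ("simultaneous", True, True, 1500),            # slow outlier
    ("simultaneous", False, True, 500), ("simultaneous", False, True, 510),
    ("simultaneous", False, True, 530), ("simultaneous", False, False, 300),  # error
    ("sequential", True, True, 495), ("sequential", True, True, 500),
    ("sequential", True, True, 505),
    ("sequential", False, True, 498), ("sequential", False, True, 502),
    ("sequential", False, True, 506),
]
trials = [dict(mode=m, congruent=c, correct=ok, rt=rt) for m, c, ok, rt in raw]


def median_correct_rt(trials, mode, congruent):
    """Median RT over correct trials in one cell of the 2 x 2 design."""
    rts = [t["rt"] for t in trials
           if t["mode"] == mode and t["congruent"] == congruent and t["correct"]]
    return median(rts)


def congruity_effect(trials, mode):
    """Incongruent minus congruent median RT; positive = congruity advantage."""
    return median_correct_rt(trials, mode, False) - median_correct_rt(trials, mode, True)


print(congruity_effect(trials, "simultaneous"))  # 20
print(congruity_effect(trials, "sequential"))    # 2
```

In this toy data set the 1,500 ms outlier leaves the median untouched, which is the reason given in the text for preferring medians over means.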

Fig. 2. Means of median response times (panel A) and mean accuracy rates (panel B) in the numerical comparison task of Experiment 1, as a function of size congruity and presentation mode. Error bars represent within-subjects 95% confidence intervals (Loftus & Masson, 1994).

The accuracy results generally mirrored the pattern in RTs. The main effect of presentation mode did not approach significance, F < 1. The main effect of size congruity was significant, F(1, 15) = 10.974, p = .005, ηp2 = .422, with more accurate performance on congruent trials than on incongruent trials. Critically, the size congruity effect varied as a function of whether the number pairs were presented sequentially or simultaneously, F(1, 15) = 5.306, p = .036, ηp2 = .261. Analysis of simple effects underlying such an interaction showed that while performance for simultaneously presented pairs of numbers was significantly more accurate on congruent trials (M = 98.7%) than on incongruent trials (M = 95.5%), F(1, 15) = 18.483, p = .001, ηp2 = .552, the accuracy of performance for sequentially presented pairs of numbers did not vary between the two types of trials (96.7% vs. 96.5%), F < 0.1. This pattern of accuracy results is presented in Fig. 2B.
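The effect sizes reported alongside each F value follow from the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error), so they can be checked directly from the reported statistics. The helper name `partial_eta_squared` below is our own; the sketch simply recomputes two of the accuracy effect sizes given above.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction on accuracy: F(1, 15) = 5.306
print(round(partial_eta_squared(5.306, 1, 15), 3))   # 0.261
# Simple effect for simultaneous presentation: F(1, 15) = 18.483
print(round(partial_eta_squared(18.483, 1, 15), 3))  # 0.552
```

Both values match the reported ηp2 = .261 and ηp2 = .552.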

The results showed that the size congruity effect was evident in the simultaneous presentation mode, whereas it was not obtained in the sequential presentation mode. It could be argued that the lack of a size congruity effect in the sequential presentation mode might be driven by responses on a subset of trials in which the first number in a sequential number pair was 1 or 9. This is because a first number of 1 or 9 would always be the numerically smaller or the numerically larger number in its pair, making it possible for the participant to complete the numerical task without processing the second number. To address this concern, we performed supplementary analyses on the data in the sequential presentation condition, excluding trials on which the first number was either 1 or 9 (16.7% of the trials in the sequential presentation condition). These analyses still showed no size congruity effect for either RT or accuracy (ps > .282). Hence, the absence of a size congruity effect in the sequential presentation mode was not likely driven by trials on which the numerical task could be solved without performing a number comparison.
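The 16.7% figure can be recovered from the stimulus set by simple counting, assuming (our assumption, not stated in the text) that every number pair appeared equally often in each temporal order. The variable names are illustrative only.

```python
from fractions import Fraction

NUMBER_PAIRS = [(1, 2), (3, 4), (6, 8), (7, 9), (2, 7), (3, 8)]

# Each pair can occur in two temporal orders: (first, second).
ordered = [(a, b) for a, b in NUMBER_PAIRS] + [(b, a) for a, b in NUMBER_PAIRS]

# Orders in which the first digit alone settles the comparison.
decidable = [pair for pair in ordered if pair[0] in (1, 9)]

share = Fraction(len(decidable), len(ordered))
print(decidable)                      # [(1, 2), (9, 7)]
print(round(100 * float(share), 1))   # 16.7
```

Only two of the twelve ordered pairs begin with 1 or 9, which gives 2/12, or about 16.7% of sequential trials.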

It is noteworthy, however, that by using only the sequential presentation mode similar to that included in the present experiment, Ben-Meir et al. (2012) observed the presence of a size congruity effect in the sequential presentation. While the exact causes of the discrepancy between the results of Ben-Meir et al. (2012) and ours are unclear, we speculate that the difference in stimulus sets and timing conditions of the two studies might be responsible for the different results. Thus, despite the null effects for the sequential presentation mode in our Experiment 1, we do not intend to make a claim that a size congruity effect cannot occur in comparative judgments of the numerical values of number pairs sequentially presented at fixation, especially considering the evidence for such an effect reported previously (Ben-Meir et al., 2012). With this view, although the observed distinction between sequential and simultaneous presentations indicates that spatial attention shifts between numbers contribute to the size congruity effect, prudence is necessary in drawing the conclusion from current data that attention is the sole contributing factor to the production of the size congruity effect. We expand on this idea later in the General discussion section.

The results of our first experiment suggest an attentional contribution to the size congruity effect in a numerical comparison task, which is in line with the findings of Risko et al. (2013). In the following experiments, we sought to provide further evidence for the attentional contribution to the size congruity effect by using a physical comparison task (Experiment 2) and a physical matching task (Experiment 3). In the physical comparison task, participants were asked to decide which of the two numbers was physically larger while ignoring their numerical values. In the physical matching task, participants were required to judge whether the two numbers were the same or different in physical size, irrespective of their numerical values. The use of two different task paradigms serves to enhance the generality of our results. If spatial shifts of attention between numbers do indeed contribute to the size congruity effect, then the attentional contribution would also be expected for a physical task.

Experiment 2

Method

This was very similar to the method of Experiment 1, except that the experimental task was to compare physical sizes rather than numerical values of a pair of numbers presented sequentially or simultaneously. Here, participants were instructed to indicate the physically larger number of the pair while ignoring the numerical values of the numbers on each trial, by pressing the “F” or “J” key on the computer keyboard. They had to perform the physical comparison task as quickly and accurately as possible. A new group of 16 volunteers (seven females; 18–25 years of age) participated in this experiment. Each participant completed two practice blocks of 24 trials each, followed by two experimental blocks of 144 trials each. Participants could take a short pause between blocks.

Results and discussion

Two separate ANOVAs were conducted to analyze the RTs and accuracy rates of the physical comparison task using presentation mode (sequential vs. simultaneous) and size congruity (congruent vs. incongruent) as within-subjects factors. An analysis of the RTs revealed that the main effect of presentation mode did not approach significance, F < 1. The main effect of size congruity was significant, F(1, 15) = 33.365, p < .001, ηp2 = .690, with faster performance on congruent trials than on incongruent trials. There was also a significant interaction between the two factors, F(1, 15) = 11.742, p = .004, ηp2 = .439. Analysis of simple effects underlying the interaction indicated that while responses to simultaneously presented number pairs were significantly faster on congruent trials (M = 593 ms) than on incongruent trials (M = 693 ms), F(1, 15) = 28.116, p < .001, ηp2 = .652, RTs to sequentially presented number pairs were only marginally faster on congruent trials (M = 609 ms) than on incongruent trials (M = 632 ms), F(1, 15) = 4.114, p = .061, ηp2 = .215. As can be seen in Fig. 3A, the magnitude of the size congruity effect was markedly smaller in the sequential presentation mode (M = 23 ms) than in the simultaneous presentation mode (M = 100 ms), F(1, 15) = 11.742, p = .004, ηp2 = .439.
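The comparison of effect magnitudes and the interaction term carry the same F value because, in a 2 × 2 within-subjects design, the interaction is equivalent to a one-sample t test on each participant's difference between the two congruity effects, with F(1, n − 1) = t². The sketch below illustrates this with invented per-participant effects; the function name `interaction_f` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_f(effect_simultaneous, effect_sequential):
    """F for the mode x congruity interaction from per-participant congruity effects.

    Each list holds one congruity effect (incongruent minus congruent, in ms)
    per participant, in the same participant order.
    """
    diffs = [a - b for a, b in zip(effect_simultaneous, effect_sequential)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # one-sample t against zero
    return t * t, (1, n - 1)


# Hypothetical congruity effects (ms) for four participants.
sim = [90, 110, 100, 120]
seq = [30, 20, 25, 15]
f_value, dfs = interaction_f(sim, seq)
print(round(f_value, 1), dfs)  # 72.6 (1, 3)
```

Framing the interaction as a difference of congruity effects makes the key quantity explicit: how much larger the effect is under simultaneous than under sequential presentation.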

Fig. 3. Means of median response times (panel A) and mean accuracy rates (panel B) in the physical comparison task of Experiment 2, as a function of size congruity and presentation mode. Error bars represent within-subjects 95% confidence intervals (Loftus & Masson, 1994).

An analysis of the accuracy data showed that performance was significantly less accurate overall in the sequential presentation mode than in the simultaneous presentation mode, F(1, 15) = 21.874, p < .001, ηp2 = .593. Performance was significantly more accurate on congruent trials than on incongruent trials, F(1, 15) = 46.097, p < .001, ηp2 = .754. However, this size congruity effect on physical-comparison accuracy did not vary as a function of whether the number pairs were presented sequentially or simultaneously, as indicated by the lack of an interaction between size congruity and presentation mode, F < 1. The results of accuracy are depicted in Fig. 3B.

The results showed that although the magnitude of the size congruity effect on accuracy was comparable between the simultaneous and sequential presentations, the effect of size congruity on RTs was much smaller in the sequential presentation mode than in the simultaneous presentation mode. This pattern of results suggests that spatial shifts of attention also contribute to the size congruity effect in the physical comparison task. Before discussing this further, we sought to corroborate the attentional contribution to the size congruity effect in Experiment 3 by using a physical matching task in which participants had to decide whether the two numbers in a pair were the same or different in terms of their physical sizes. The physical matching task, also often referred to as the physical same-different task, is commonly used in the field of number processing (e.g., Dehaene & Akhavein, 1995; Ganor-Stern & Tzelgov, 2008). In Experiment 3, we created congruent and incongruent conditions on “different” trials, in which the two numbers of a pair differed in both physical and numerical dimensions, so that the size congruity effect could be assessed with the physical matching task. We suggest that interactions between numerical and physical magnitudes can explain how the size congruity effect would play out in the physical matching task. For example, given that the numerical and physical magnitudes of a number interact at early stages of perceptual processing (e.g., Walsh, 2003), a numerically smaller number would tend to be perceived as physically smaller. Participants may therefore make more judgment errors and/or respond more slowly on “different” trials when the numerically smaller number is actually physically larger (i.e., the incongruent condition). Experiment 3 was devised with this idea in mind.

Experiment 3

Method

This was similar to the method of Experiment 2, except that the experimental task required physical matching rather than physical comparison for number pairs. Here, participants were instructed to judge whether the two numbers in a pair were the same or different in terms of their physical sizes, irrespective of numerical values. They were required to press the “F” and “J” keys for the “same” and “different” responses, respectively. The physical sizes of the paired numbers were the same on half of the trials, and they were different on the other half. The pair of numbers always differed in numerical values on every trial. The congruent condition (when the physically larger number was also numerically larger) and the incongruent condition (when the physically larger number was numerically smaller) occurred equally often and varied randomly within the “different” trials. A new group of 16 volunteers (eight females; 18–25 years of age) participated in this experiment. Each participant completed two practice blocks of 24 trials each, followed by two experimental blocks of 144 trials each. Participants could take a short break between blocks.

Results and discussion

Two separate ANOVAs were conducted to analyze the RTs and accuracy rates for “different” trials using presentation mode (sequential vs. simultaneous) and size congruity (congruent vs. incongruent) as within-subjects factors. Figure 4A depicts the means of median correct RTs for “different” trials across conditions in the physical matching task. An analysis of the RTs showed that the main effect of presentation mode was significant, F(1, 15) = 33.250, p < .001, ηp2 = .689, with faster performance in the sequential presentation condition than in the simultaneous presentation condition. The main effect of size congruity was also significant, F(1, 15) = 18.432, p = .001, ηp2 = .551, with faster responses on congruent trials than on incongruent trials. The interaction effect between the two factors did not approach significance, F(1, 15) = 1.631, p = .221, ηp2 = .098. However, given our a priori prediction of the reduction or elimination of the size congruity effect in the sequential presentation, we followed this omnibus analysis with planned comparisons within each presentation mode. These analyses revealed that while responses to simultaneously presented numbers were significantly faster on congruent trials (M = 827 ms) than on incongruent trials (M = 907 ms), F(1, 15) = 6.047, p = .027, ηp2 = .287, RTs to sequentially presented numbers did not significantly differ between the two types of trials (696 ms vs. 714 ms), F < 1.

Fig. 4. Means of median response times (panel A) and mean accuracy rates (panel B) for “different” trials in the physical matching task of Experiment 3, as a function of size congruity and presentation mode. Error bars represent within-subjects 95% confidence intervals (Loftus & Masson, 1994).

Figure 4B depicts the mean accuracy rates for “different” trials across conditions in the physical matching task. The main effect of presentation mode did not approach significance, F(1, 15) = 1.407, p = .254, ηp2 = .086. The main effect of size congruity was significant, F(1, 15) = 5.444, p = .034, ηp2 = .266, with more accurate performance on congruent trials than on incongruent trials. Critically, the size congruity effect varied as a function of whether the number pairs were presented sequentially or simultaneously, F(1, 15) = 5.634, p = .031, ηp2 = .273. Analysis of simple effects underlying such an interaction showed that while performance for simultaneously presented pairs of numbers was significantly more accurate on congruent trials (M = 70.1%) than on incongruent trials (M = 63.9%), F(1, 15) = 12.273, p = .003, ηp2 = .450, the accuracy of performance for sequentially presented pairs of numbers did not significantly differ between the two types of trials (64.1% vs. 63.0%), F < 1.

It is noteworthy that mean percent correct on the physical matching task was markedly lower on “different” trials (65.3%) than on “same” trials (80.1%), F(1, 15) = 7.787, p = .014, ηp2 = .342, while mean RT was significantly faster on “different” trials (786 ms) than on “same” trials (871 ms), F(1, 15) = 6.637, p = .021, ηp2 = .307. Thus, the less accurate performance on “different” trials may be caused at least in part by a speed-accuracy trade-off. Moreover, we speculate that the poorer performance on “different” trials might also be partially driven by a response bias that was likely to occur for our participants in the current physical matching task. That is, although the “same” and “different” trials occurred equally often in this experiment, participants may more likely make a “same” response to a given trial, thereby producing more errors on “different” trials.

Overall, the results showed that when the task required physical matching, the size congruity effect was evident for both RT and accuracy in responding “different” for simultaneously presented number pairs, whereas such an effect was not found for either RT or accuracy in responding “different” for sequentially presented number pairs. This pattern of results provides further evidence for a contribution of spatial shifts of attention in generating the size congruity effect in a physical task.

General discussion

The size congruity effect indicates that the symbolic magnitude (numerical value) and nonsymbolic magnitude (physical size) of numbers interact with each other in magnitude judgments, supporting the notion that numerical value and physical size partly share an integrated analog representation while also occupying separate representations that may interact later at the response decision level (Arend & Henik, 2015; Cappelletti et al., 2009, 2011; Walsh, 2003). Recent research by Risko et al. (2013) shows that attentional capture by the physically larger number modulates the interactions between numerical and physical magnitudes during numerical comparisons, suggesting a contribution of spatial attention shifts to the size congruity effect. However, it has remained unclear whether there is also an attentional contribution to the size congruity effect in a physical task. By directly comparing the size congruity effects between sequential and simultaneous presentations of number pairs, the present study extends the previous work by corroborating the attentional contribution to the size congruity effect in both numerical and physical tasks. Across three experiments using either a numerical or a physical task, we consistently found that the size congruity effect was reduced or even eliminated in the sequential presentation mode compared with the simultaneous presentation mode. Given that in the sequential presentation mode the paired numbers appeared at central fixation and consequently any spatial shifts of attention between numbers were precluded, the decrement in the size congruity effect implies the crucial importance of spatial attention shifts for the observed pattern of interactions between numerical and physical magnitudes of numbers. The present findings also lend strong support to the notion that magnitude judgments of simultaneously presented number pairs are essentially a visual search task in which spatial shifts of attention between numbers contribute to the size congruity effect (Risko et al., 2013; Sobel et al., 2016).

It should be noted that no evidence for the size congruity effect in the sequential presentation mode was obtained in Experiments 1 and 3. Given that in the sequential presentation mode the first number must be actively maintained in working memory for a later judgment (Ben-Meir et al., 2012), it could be argued that the absence of the size congruity effect reflects a failure to maintain the task-irrelevant magnitude dimension of the first number in working memory. However, we consider this explanation unlikely. The results of our Experiment 2, along with those of Ben-Meir et al. (2012), showed the presence of the size congruity effect in the sequential presentation mode, providing converging evidence that both the relevant and irrelevant magnitude dimensions of the first number stimulus are encoded and stored in working memory. This is also consistent with the broader notion that working memory stores a multiple-feature stimulus in an object-based manner, with all feature dimensions being encoded and maintained in working memory regardless of their relevance to the task (e.g., Luck & Vogel, 1997, 2013). Accordingly, it is possible that in our study the irrelevant magnitude dimension of the first number in the sequential presentation mode was automatically stored in working memory and that the absence of the size congruity effect for sequentially presented number pairs in Experiments 1 and 3 simply reflects a lack of sufficient sensitivity and power. Note, though, that this does not invalidate the main results of these experiments, which show a clear distinction between sequential and simultaneous presentations of number pairs in the size congruity effect.

Moreover, we propose that the decrement in the size congruity effect in the sequential presentation mode is unlikely to be attributable to working memory processing of the first number’s relevant and irrelevant magnitude dimensions. In other words, the internal maintenance of the first number’s relevant and irrelevant magnitudes in working memory per se would not reduce the size congruity effect in a numerical Stroop task, as compared to when such magnitudes are externally perceived without working memory processing (i.e., in the simultaneous presentation mode). This claim is grounded in previous research indicating that the magnitude of the Stroop effect driven by working memory maintenance is comparable to that of the classic Stroop effect (e.g., Chen et al., 2017; Kiyonaga & Egner, 2014; Pan et al., 2019). On the basis of such studies, one would predict that the size congruity effect in a numerical Stroop task should be of equivalent magnitude for the sequential presentation mode (i.e., the working memory numerical Stroop task) and the simultaneous presentation mode (i.e., the classic numerical Stroop task). However, we found that the size congruity effect was significantly reduced or even eliminated in the sequential presentation mode relative to the simultaneous presentation mode. As such, we suggest that working memory mechanisms cannot account for the present results. Instead, given that spatial shifts of attention between numbers are present for the simultaneous presentation mode but absent for the sequential presentation mode, it is conceivable that the observed distinction between sequential and simultaneous presentations indicates a critical role for spatial attention shifts in producing the size congruity effect in a numerical Stroop task.

What underlying mechanisms could possibly lead to this attentional contribution to the size congruity effect during magnitude judgments? According to the theory of magnitude (Walsh, 2003), there exists an interaction between different magnitude dimensions of a stimulus at the early processing stage, with the representation of one magnitude dimension being modulated by magnitude information in another dimension. Given that spatial attention can facilitate stimulus processing at the attended location (e.g., Carrasco, 2011), it is possible that shifts of spatial attention, such as an involuntary shift of attention toward the location of the physically larger number, may enhance the early interaction between the numerical and physical magnitudes of the attended number (i.e., the physically larger number) in the simultaneous presentation mode, rendering the representation of the task-relevant magnitude more modifiable by the task-irrelevant magnitude (e.g., Reike & Schwarz, 2017). Consequently, when physical magnitude is the task-relevant dimension, a size-based attention shift increases the perceived physical size of the attended number on congruent trials and decreases its perceived physical size on incongruent trials; when numerical magnitude is the task-relevant dimension, the size-based attention shift increases the accessed numerical value of the attended number on both congruent and incongruent trials. As a result, the involuntary shift of attention toward the physically larger number facilitates magnitude judgments on congruent trials and impairs magnitude judgments on incongruent trials for both physical and numerical tasks, thereby contributing to the size congruity effect in the simultaneous presentation mode.

Nevertheless, we do not wish to argue that the size congruity effect reflects only the contribution of spatial attention shifts between numbers. A pure attentional account of the size congruity effect cannot explain the presence of a size congruity effect when the two numbers are presented sequentially at central fixation (Ben-Meir et al., 2012) or when only a single number is presented per trial (Santens & Verguts, 2011; Schwarz & Ischebeck, 2003; Tzelgov et al., 1992), since such displays offer no opportunity for spatial shifts of attention between numbers. Instead, the size congruity effect observed in such displays may be produced solely by the interactions between physical and numerical magnitudes occurring at early and/or later processing stages (Arend & Henik, 2015; Cappelletti et al., 2009, 2011; Reike & Schwarz, 2017; Walsh, 2003), without any contribution of spatial attention shifts. As discussed above, we suggest that spatial shifts of attention contribute to the size congruity effect for simultaneously presented number pairs through enhancement of the early interaction between physical and numerical magnitudes. Hence, it is conceivable that the contribution of spatial attention shifts between numbers is not the only source of the size congruity effect in the simultaneous presentation mode.

The current data can also be discussed from the perspective of the asymmetry in the size congruity effect. First, our main results demonstrate an asymmetry in the size congruity effect as a function of presentation mode by showing that the size congruity effect in a given task was larger when number pairs were presented simultaneously rather than sequentially. Second, the observed size congruity effect was markedly greater in the physical comparison task (Experiment 2) than in the numerical comparison task (Experiment 1) for both sequential and simultaneous presentations. This finding extends previous work that established the asymmetry in the size congruity effect as a function of task only in the simultaneous presentation mode (e.g., Algom et al., 1996; Arend & Henik, 2015; Dadon & Henik, 2017; Fitousi & Algom, 2006; Pansky & Algom, 1999). Here, we showed that the size congruity effect in the sequential presentation mode was present during physical comparisons but absent during numerical comparisons, hence providing the first evidence for the task asymmetry in the size congruity effect during comparative judgments of sequentially presented numbers. The task asymmetry in the size congruity effect may reflect the influence of the asymmetry in the discriminability or saliency of the numerical and physical dimensions on comparative judgments of numbers (e.g., Algom et al., 1996; Fitousi & Algom, 2006). According to this view, the more salient dimension should affect comparative judgments of the less salient dimension, but not vice versa. Given that the dimensional saliency is determined by a comparison of feature values along a dimension of the paired numbers (e.g., Arend & Henik, 2015), the present findings of the task asymmetry in the size congruity effect for both sequential and simultaneous presentations suggest that such dimensional saliency can be computed not only when the two numbers in a pair are simultaneously presented in the scene, but also when they are sequentially presented without being eventually viewed together.

In summary, the present results show a clear distinction between sequential and simultaneous presentations of number pairs during magnitude judgments regarding numerical values or physical sizes, with the observed size congruity effect being significantly reduced or even absent in the sequential presentation mode. Because spatial attention shifts, such as an involuntary shift of attention toward the location of the physically larger number, should occur in the simultaneous presentation mode but not in the sequential presentation mode, the present findings support the idea that spatial shifts of attention between numbers contribute to the size congruity effect for both numerical and physical magnitude judgments.