Introduction

The classical “spotlight” metaphor of spatial attention describes the attended location in visual space as a place illuminated by a spotlight, in which the neurocognitive processing of the target is instantly facilitated (Posner, 1980). The facilitated cognitive processes are often evidenced by faster reaction times (RTs) and higher accuracies in the response to a target at the same location as an exogenous spatial cue (the cued or cue valid condition), relative to a target at the location opposite the spatial cue (the uncued or cue invalid condition). Crucially, the facilitatory spatial cueing effect is observed only when the stimulus onset asynchrony (SOA) between the cue and the target is short (e.g., less than about 300 ms). When the SOA increases, however, an inhibitory effect appears: RTs to targets at the cued location are delayed compared with RTs to targets at the uncued location (Chen, Fuentes, & Zhou, 2010; Klein, 2000; Posner & Cohen, 1984; Tipper & Kingstone, 2005). This inhibitory effect is termed inhibition of return (IOR), which slows down attentional reorienting to the previously attended (cued) location and thus increases the efficiency of foraging elsewhere.

The early facilitation and late inhibition revealed by the classical exogenous spatial cueing paradigm seem to suggest a hardwired temporal boundary between attentional orienting to the cued location and attentional reorienting to a new (uncued) location, and the locations seem to be consecutively illuminated by the spotlight in each of the two discrete phases. This impression, however, is challenged by a growing number of studies suggesting that the spotlight “blinks” rhythmically, leading to alternating cycles of improved and impaired behavioral performance at the cued and uncued locations (Dugué, Roberts, & Carrasco, 2016; Fiebelkorn & Kastner, 2019; Landau & Fries, 2012), even when sustained attention is promoted at the cued location (Fiebelkorn, Saalmann, & Kastner, 2013). In these studies, a visual stimulus is first presented as a time reference, by which the attentional cycle can be reset and aligned across trials (VanRullen, 2016). Importantly, the SOA between the first stimulus (a spatial cue) and the subsequent target is manipulated with a fine temporal resolution (e.g., 50 Hz when the SOA varies in steps of 20 ms) such that the behavioral performance at multiple phases within an attentional cycle can be probed (Fiebelkorn et al., 2013; Landau & Fries, 2012; Song, Meng, Chen, Zhou, & Luo, 2014). For instance, in a modified exogenous spatial cueing paradigm with time-resolved SOAs, Landau and Fries (2012) showed that improved and reduced detection accuracy of the target at the cued location (relative to the uncued location) alternated at a frequency of 4 Hz (theta band).

Behavioral rhythms during deployment of spatial attention have been linked to neural oscillations. For instance, electrophysiological studies reported a link between either theta (Busch, Dubois, & Vanrullen, 2009; Fiebelkorn, Pinsk, & Kastner, 2018; Hanslmayr, Volberg, Wimber, Dalal, & Greenlee, 2013; Helfrich et al., 2018; ten Oever & Sack, 2015; VanRullen, Busch, Drewes, & Dubois, 2011) or alpha phase (Busch et al., 2009; Dugué, Marque, & VanRullen, 2011; Fiebelkorn et al., 2018; Fiebelkorn, Pinsk, & Kastner, 2019; Harris, Dux, & Mattingley, 2018; Jensen, Gips, Bergmann, & Bonnefond, 2014; Sherman, Kanai, Seth, & Vanrullen, 2016) and behavioral detection performance. Nonhuman primate studies showed that these neural oscillations are mainly distributed in the visual cortex (V1–V4; Kienitz et al., 2018; Spyropoulos, Bosman, & Fries, 2018) or the frontoparietal attention network, including the frontal eye field (FEF) and the lateral intraparietal area (LIP; Fiebelkorn et al., 2018, 2019). When the phase of a neural oscillation is reset by a spatial cue (Helfrich & Knight, 2016), behavioral performance at different time points after the onset of the spatial cue would fluctuate as the phase of the neural oscillation changes. The oscillatory components in the time course of behavioral performance may represent cognitive processes similar to those reflected in neural oscillations. With a spatial cueing paradigm, Song et al. (2014) found that, in contrast to the cue invalid condition, pulsed alpha inhibition (lower alpha power; i.e., lower amplitude of RT fluctuation in the alpha band) was present in the RT time course for the cue valid condition, and that this inhibition itself fluctuated at a theta frequency (2–5 Hz).
The authors suggested that decreases in alpha power in the time course of RTs represent enhanced spatial attention, consistent with electrophysiological studies which showed that higher alpha-band activity is associated with the suppression of sensory processing (van Diepen, Foxe, & Mazaheri, 2019; Händel, Haarmeier, & Jensen, 2011; Helfrich, Huang, Wilson, & Knight, 2017; Kizuk & Mathewson, 2017; Klimesch, Sauseng, & Hanslmayr, 2007; Marshall et al., 2018; Thut, 2006; Worden, Foxe, Wang, & Simpson, 2000). The fluctuation of the alpha power in the RT time course may represent periodical attention sampling of the environment.

Attentional orienting, demonstrated by the early facilitation in the classical spatial cueing paradigm, has been assumed to be stimulus driven and reflexive, whereas attentional reorienting, demonstrated by the late inhibition, has been assumed to be goal directed and voluntary (Lee & Shomstein, 2013; Shomstein & Johnson, 2013). As an extension of the traditional dichotomy of stimulus-driven (bottom-up) versus goal-directed (top-down) attention, mounting evidence has shown a critical role of reward in modulating spatial attention (Awh, Belopolsky, & Theeuwes, 2012; Chelazzi, Perlato, Santandrea, & Della Libera, 2013), in the way that a stimulus that is associated with reward attracts attention more than a stimulus that is not associated with reward (or high-reward vs. low-reward) even when this reward-associated stimulus is not the current task goal and/or detrimental (Anderson, Laurent, & Yantis, 2011; Failing & Theeuwes, 2017; Wang, Yu, & Zhou, 2013). By associating different levels of reward with the exogenous cue in the classical spatial cueing paradigm, researchers (Bucker & Theeuwes, 2014; Lee & Shomstein, 2013) have found that the early facilitatory effect (e.g., RT difference between cued and uncued targets at short SOAs) is barely affected by the level of reward, whereas the late inhibitory effect (e.g., RT difference at long SOAs) is larger following a high-reward cue than following a low-reward cue. This observation leads to the suggestion that the early attentional orienting to the cued location is too reflexive to be modulated by reward, whereas the late attentional reorienting to the cued location is governed by the top-down process that is susceptible to the motivational state induced by the reward-associated cue (Bucker & Theeuwes, 2014).

It should be noted, however, that the behavioral performance in the above studies manipulating the level of reward was sampled with a very low frequency, as the cue–target interval was varied discretely and sparsely. It remains largely unknown how the rhythmic characteristic of spatial attention at higher frequencies is modulated by reward. To address this issue, we associated a high or low reward with the spatial cue and included 46 levels of cue–target SOA, ranging from 200 ms to 1,100 ms in steps of 20 ms. By investigating how RTs to the target at different SOAs were modulated by reward, we were able to show the modulatory effect of reward on the characteristics of attentional cycles; these cycles were reset by the spatial cue, which was also predictive of the level of reward for the current trial. We focused on two aspects in data analysis. First, the RT time courses were low-pass filtered (0–2 Hz), and the reward modulation of these low-frequency data was assessed. This was to replicate the findings in previous studies where responses to the target were sampled discretely and sparsely. Second, time-frequency analysis was applied to the RT time courses. With this analysis, spectrotemporal changes in RT time courses were examined. This was to examine whether the relative alpha power between the cue valid and cue invalid conditions changed periodically (Song et al., 2014) and how the alpha power difference was modulated by reward.

Material and method

Participants

Twenty-two university students participated in the experiment (nine males, all right-handed, 18–23 years of age, mean = 20.4 years). All participants had normal or corrected-to-normal visual acuity, and none of them reported color blindness or color weakness. Participants received 50 RMB (about US$7) for participation and could earn an extra 0–20 RMB, depending on their performance in the task. This study was performed in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the School of Psychological and Cognitive Sciences, Peking University.

Design and procedure

Participants sat in front of a CRT monitor (refresh rate = 100 Hz) in a dimly lit room, with their heads stabilized in a chin rest. The eye-to-monitor distance was fixed at 70 cm. Responses were obtained through a standard keyboard by pressing the “F” and “J” keys. Three placeholders with black frames (each placeholder with a visual angle of 2° × 2°) were presented against a white screen throughout each trial. The boxes were positioned side by side, with equal distances between adjacent boxes (4° between the centers of the boxes). Participants were required to maintain eye fixation on the central box without making eye movements.

In each trial, after a varied interval of 800–1,200 ms, a cue was presented inside either the left or right box for 100 ms. This cue was either a red or green square that filled the box. The color of the cue signaled a potentially high or low reward, with the association between the color (red vs. green) and the reward level (high vs. low) counterbalanced across individuals. After another varied interval of 100–1,000 ms, a target letter “X” or “O” (1.3° × 1.3°) was presented inside the left or right box for 150 ms. That is, the stimulus onset asynchronies (SOAs) between the cue and the target varied from 200 ms to 1,100 ms. The target was presented in the same box as the cue (cue valid condition) or in the box opposite the cue (cue invalid condition) with equal probability. Thus, the location of the cue was uninformative of the location of the target. Participants were asked to identify the target by pressing “F” or “J” on the keyboard with the left and the right index finger, respectively. The mapping between the letter identity (“X” vs. “O”) and the button (“F” vs. “J”) was counterbalanced across individuals. The time window allowing for a button press was from the onset of the target to 2,000 ms after the offset of the target. The intertrial interval was a blank screen with a varied duration of 800–1,200 ms. The procedure is shown in Fig. 1. Participants were explicitly informed of the association between the color of the cue and the potential reward level, and that reward could be obtained only when a correct and fast (RT < 800 ms) response was given. In a high-reward trial, a correct and fast response would result in the gain of 10 points; in a low-reward trial, a correct and fast response would result in the gain of 1 point. In trials where the response was incorrect or slow, no reward would be obtained. Participants were informed that the points gained in each trial would be accumulated during the experiment; at the end of the experiment, the total points would be proportionally exchanged for money in addition to the basic payment.

Fig. 1

Experimental procedure. Participants maintain fixation at the central box and covertly attend to two peripheral boxes. After a varying interval of 200–1,100 ms after the cue onset, a target (X or O) was presented for 150 ms within the cued (valid) box or the uncued (invalid) box. The trial was terminated when a response was given or the time limit (i.e., 2,000 ms) was reached. The cue was a red or green square. The colors of the cue indicated a high or low-reward condition. Participants were asked to identify the target with a button press. The response (RT and accuracy) was recorded. (Color figure online)

The crucial manipulation was the fine temporal assessment of the behavioral performance on target discrimination (Dugué et al., 2016; Landau & Fries, 2012). To achieve this temporal resolution, the SOA between the cue and the target in a trial was chosen from one of 46 values from 200 ms to 1,100 ms in steps of 20 ms after cue onset, corresponding to a sampling rate of 50 Hz. There were 440 trials for each of the four experiment conditions: high-reward, valid; high-reward, invalid; low-reward, valid; low-reward, invalid. In each condition, the number of trials with the SOA of 200 ms was 10 times (i.e., 80 trials) more than the number of trials with longer SOAs (i.e., eight trials for each of the other 45 SOAs), to achieve a more prominent effect of cue resetting. This was carried out in accordance with a previous study (Fiebelkorn et al., 2011), which showed that the deployment of anticipatory attention to the cue (high probability of target appearance immediately after the cue) could enhance the cue resetting effect (more prominent behavioral oscillation effect than equiprobable target appearance). The four conditions were pseudorandomly distributed in 1,760 trials, and were then divided into 20 blocks with equal length. At the end of each block, the accumulated points thus far were presented on the screen. There were self-paced breaks between blocks. The trial sequences were different for different participants.
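The trial counts described above can be checked with a short sketch. This is only an illustration under stated assumptions, not the authors' randomization code: the function name `build_trial_list`, the dictionary keys, the seed, and the use of a plain shuffle (the text says only "pseudorandomly") are all hypothetical.

```python
import random

# SOAs from 200 to 1,100 ms in 20-ms steps (46 values, i.e., 50 Hz sampling).
SOAS_MS = list(range(200, 1101, 20))
CONDITIONS = [(reward, validity)
              for reward in ("high", "low")
              for validity in ("valid", "invalid")]


def build_trial_list(seed=0, n_first=80, n_other=8, n_blocks=20):
    """Return the full trial list, shuffled and split into equal-length blocks."""
    trials = []
    for reward, validity in CONDITIONS:
        for soa in SOAS_MS:
            # The 200-ms SOA is oversampled (80 trials); all others get 8 trials.
            repeats = n_first if soa == SOAS_MS[0] else n_other
            trials.extend({"reward": reward, "validity": validity, "soa_ms": soa}
                          for _ in range(repeats))
    rng = random.Random(seed)  # hypothetical: one seed per participant
    rng.shuffle(trials)        # assumption: unconstrained shuffle
    block_len = len(trials) // n_blocks
    return [trials[i * block_len:(i + 1) * block_len] for i in range(n_blocks)]


blocks = build_trial_list()
# Number of SOAs, total trials, trials per block.
print(len(SOAS_MS), sum(len(b) for b in blocks), len(blocks[0]))
```

Each condition contributes 80 + 45 × 8 = 440 trials, so the four conditions give 1,760 trials, or 88 trials in each of the 20 blocks.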

Data analyses

Behavioral data were analyzed using MATLAB, in conjunction with the EEGLAB toolbox and wavelet toolbox. For each participant, omissions, trials with RTs lower than 200 ms, and trials with incorrect responses were first excluded. Trials with RTs beyond four standard deviations in each of the four conditions (high-reward, valid; high-reward invalid; low-reward, valid; low-reward, invalid) were also excluded. For each participant, RTs from the remaining trials were then normalized across all conditions (i.e., Z-scored); this was to control the variance between individuals in motor responses. Note, for each participant, after within-participant normalization, the relationship between RTs of the trials was kept intact, although RTs were normalized close to zero.
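The exclusion and normalization steps can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB code; the function name `preprocess`, the trial dictionary keys, and the use of the sample standard deviation (N − 1 denominator) are assumptions.

```python
from statistics import mean, stdev


def preprocess(trials, min_rt=200.0, n_sd=4.0):
    """Clean one participant's trials and add a z-scored RT to each.

    trials: list of dicts with keys 'rt' (ms, or None for an omission),
    'correct' (bool), and 'condition' (any hashable label).
    """
    # Step 1: drop omissions, RTs below 200 ms, and incorrect responses.
    kept = [t for t in trials
            if t["rt"] is not None and t["rt"] >= min_rt and t["correct"]]

    # Step 2: within each condition, drop RTs beyond n_sd standard deviations.
    trimmed = []
    for cond in sorted({t["condition"] for t in kept}):
        in_cond = [t for t in kept if t["condition"] == cond]
        rts = [t["rt"] for t in in_cond]
        m, sd = mean(rts), stdev(rts)
        trimmed += [t for t in in_cond if abs(t["rt"] - m) <= n_sd * sd]

    # Step 3: z-score across all remaining trials of this participant.
    all_rts = [t["rt"] for t in trimmed]
    m, sd = mean(all_rts), stdev(all_rts)
    return [dict(t, z=(t["rt"] - m) / sd) for t in trimmed]
```

Because the z-scoring is a linear transform applied to all of a participant's trials at once, the ordering and relative spacing of RTs across conditions and SOAs are preserved.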

As shown in Supplementary Fig. S1, the RT distributions were not normal but skewed with long tails. The mean and standard deviation might therefore not be the optimal measures of the center and dispersion of the RT distributions. To validate our findings, we also log transformed the original, untrimmed RTs. The averaged log-transformed RTs across SOAs for each participant are shown in Supplementary Fig. S6. The same procedures of data analysis, including outlier detection and within-participant normalization, were conducted on the log-transformed RTs. Essentially the same pattern of results as for the nontransformed RTs was observed (see Supplementary Figs. S2–S4), demonstrating that our findings were stable and could not simply be driven by long, skewed RTs. To simplify the report of results and to remain consistent with the data analyses in previous studies, in the main text we report only the results based on the untransformed data. Analyses of log-transformed data are reported in the Supplementary Materials.

Filtering analysis

To confirm the existence of the classic spatial attention effect, the RT temporal profiles in each condition were filtered for each participant with a 0–2 Hz band-pass filter (MATLAB, EEGLAB toolbox, two-pass least-squares FIR filtering, 10th order). The resulting data were equivalent to behavioral performance with sparse SOAs between the cue and the target in previous studies (Song et al., 2014). To validate the results of low-pass filtering, we also smoothed the RT temporal profiles in each of the four conditions (high-reward, valid; high-reward, invalid; low-reward, valid; low-reward, invalid) for each participant by averaging over 10 adjacent (SOA) data points.

A 2 (reward: high vs. low) × 2 (cue validity: valid vs. invalid) × 46 (SOAs: 200–1,100 ms) repeated-measures analysis of variance (ANOVA) was conducted on the normalized RTs obtained after band-pass filtering of 0–2 Hz. Given the large number of SOAs and the interaction between cue validity and SOA (see Results section), the simple cue validity effects were examined by paired t tests at each SOA, and the significance was corrected by cluster-based permutation (Maris & Oostenveld, 2007). Specifically, the point-to-point threshold was set to p < .05 (two-tailed), and consecutive time points (n > 1) that reached significance were grouped as a cluster. Within each cluster, the t values for all time points were summed into a T value of the whole cluster (Tcluster). Then the time series of the two conditions were shuffled, and the point-to-point t test was conducted for the shuffled time series. The summed T value of the biggest cluster (Tper) based on each permutation was obtained. This permutation was repeated 5,000 times, resulting in a set of Tper values (i.e., Tpers). The cluster-level statistical significance was tested by calculating the probability of Tcluster in the distribution of Tpers. A Tcluster was identified as significant with p < .05.
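The cluster-based permutation procedure described above can be sketched in plain Python. Several details here are assumptions and not taken from the text: the hard-coded critical value `T_CRIT` (two-tailed .05 for df = 21, i.e., 22 participants), shuffling implemented as swapping the two condition labels within each participant, and grouping only same-sign significant points into one cluster. All function names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

T_CRIT = 2.080  # assumed: two-tailed .05 critical t for df = 21 (n = 22)


def paired_t(a, b):
    """Paired t value per time point; a and b are [participant][time] lists."""
    n, out = len(a), []
    for k in range(len(a[0])):
        d = [a[i][k] - b[i][k] for i in range(n)]
        out.append(mean(d) / (stdev(d) / sqrt(n)))
    return out


def cluster_sums(tvals, t_crit=T_CRIT, min_len=2):
    """Summed t of each run of >= min_len consecutive same-sign significant points."""
    sums, run = [], []
    for t in list(tvals) + [0.0]:  # trailing sentinel closes the last run
        if abs(t) > t_crit and (not run or (t > 0) == (run[-1] > 0)):
            run.append(t)
        else:
            if len(run) >= min_len:
                sums.append(sum(run))
            run = [t] if abs(t) > t_crit else []
    return sums


def cluster_test(a, b, n_perm=5000, seed=0):
    """Return [(Tcluster, p)] for every observed cluster."""
    rng = random.Random(seed)
    observed = cluster_sums(paired_t(a, b))
    null = []
    for _ in range(n_perm):
        pa, pb = [], []
        for xa, xb in zip(a, b):  # swap condition labels within a participant
            if rng.random() < 0.5:
                xa, xb = xb, xa
            pa.append(xa)
            pb.append(xb)
        perm = cluster_sums(paired_t(pa, pb))
        # Largest absolute cluster sum of this permutation (0 if no cluster).
        null.append(max((abs(s) for s in perm), default=0.0))
    return [(s, sum(x >= abs(s) for x in null) / n_perm) for s in observed]
```

The p value of each observed cluster is the proportion of permutations whose largest cluster is at least as extreme, which controls the error rate across all 46 SOAs at once.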

Time-frequency analysis

The main purpose of our study was to investigate the modulatory effect of reward on the rhythmic sampling of spatial attention. The slow (0–2 Hz) trend signals were subtracted from the corresponding RT time courses for each condition to exclude any classic attentional cueing effect. Then the detrended RT temporal profile for each condition and each participant was transformed with continuous complex Gaussian wavelet transforms (MATLAB, wavelet toolbox, order = 4), with frequency from 1 Hz to 25 Hz in steps of 1 Hz. The RT time-frequency powers were extracted from the outcome of the wavelet transforms. This time-frequency analysis was performed for each condition and each participant. For each participant, the difference of RT power profile between the valid and invalid conditions in each reward condition was calculated. The grand mean of time-frequency powers was averaged across participants.

To assess the statistical significance of the difference between the power profiles for the valid and invalid conditions, we performed a permutation procedure by shuffling the time course across the two conditions for each participant and each reward condition. For each shuffling, the time-frequency analysis procedure was performed on the shuffled signals, in the same way as on the original signals, and the difference of RT power profiles between the valid and invalid conditions was recalculated. The whole procedure was performed 1,000 times, resulting in a distribution of the valid–invalid power difference at each time-frequency point, from which the uncorrected p < .05 threshold was obtained. For multiple comparison correction, the following two methods were applied to the uncorrected threshold time-frequency map: within-frequency correction and between-frequency correction. For within-frequency correction, the maximum or minimum threshold value across all time bins was set as the threshold for that frequency. For between-frequency correction, the maximum or minimum threshold value across all time bins and all frequencies was set as the threshold for the whole map. The same procedure was also performed on the difference between power profiles in the high-reward and low-reward conditions.

FFT analysis and cross-correlation analysis

To investigate the periodic nature of the alpha power time courses (i.e., alpha powers as a function of cue-to-target SOAs) and how the incentive reward would modulate the relative alpha power enhancement/inhibition, 8–12 Hz (classic alpha band) power time-course profiles were extracted from the output of the complex Gaussian wavelet transforms of the RT temporal profiles for each participant and each condition and averaged across the frequencies (8–12 Hz). The grand mean of the alpha power was calculated across participants for each condition and was used for FFT analysis (MATLAB, fft function). The grand mean of the alpha power difference between the valid and the invalid conditions was also calculated across participants for each reward condition and was used for FFT analysis (MATLAB, fft function) and cross-correlation analysis (MATLAB, xcorr function; Adhikari, Sigurdsson, Topiwala, & Gordon, 2010; Bolkan et al., 2017).

FFT analysis

The grand averages of the alpha power for each of the four conditions (high-reward, valid; high-reward, invalid; low-reward, valid; low-reward, invalid), and the alpha power difference between the valid and the invalid conditions, averaged across participants for each reward condition, were transformed to the frequency domain using the fast Fourier transform. For statistical analysis, the time course of the alpha power difference was shuffled 1,000 times for each reward condition. Each shuffled signal was transformed to the frequency domain, resulting in a distribution of power at each frequency point from which the p < .05 threshold was obtained. For multiple-comparison correction across frequencies, the maximum of the thresholds across all frequencies was set as the corrected threshold.
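The shuffle-based threshold with correction across frequencies can be illustrated with a small sketch. It uses a direct discrete Fourier transform from the standard library in place of MATLAB's fft; the function names, the demeaning step, the exclusion of the DC bin, and the percentile indexing are assumptions made for this illustration.

```python
import cmath
import random

FS = 50.0  # Hz; one sample per 20-ms SOA step


def power_spectrum(x):
    """Return (freqs_hz, power) of the demeaned signal, DC bin excluded."""
    n = len(x)
    m = sum(x) / n
    xs = [v - m for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        c = sum(v * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, v in enumerate(xs))
        freqs.append(k * FS / n)
        power.append(abs(c) ** 2)
    return freqs, power


def corrected_threshold(x, n_perm=1000, alpha=0.05, seed=0):
    """Max across frequencies of the per-frequency (1 - alpha) null quantile."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_perm):
        shuffled = list(x)
        rng.shuffle(shuffled)  # destroys temporal structure, keeps the values
        null.append(power_spectrum(shuffled)[1])
    per_freq = []
    for k in range(len(null[0])):
        column = sorted(p[k] for p in null)
        per_freq.append(column[round((1 - alpha) * n_perm) - 1])
    return max(per_freq)
```

With 46 samples at 50 Hz the frequency resolution is 50/46 ≈ 1.09 Hz, so a 2–3 Hz modulation falls in the second or third bin. A spectral peak counts as significant in this sketch only if it exceeds the single corrected threshold.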

Cross-correlation analysis

Given that the theta modulations of alpha power were observed in both high-reward and low-reward conditions, the question then was whether there was a temporal lag or a phase difference of the modulations between the high-reward and low-reward conditions. To answer this question, cross-correlation analysis (MATLAB, xcorr function) was applied to the grand average of the valid–invalid alpha power difference time courses between the high-reward and low-reward conditions across participants. The valid–invalid difference of the alpha power profile in the low-reward condition could show the largest positive correlation with the alpha power difference in the high-reward condition after being shifted by a specific number of time points (lags). To assess the statistical significance of the lags, a permutation procedure was performed by shuffling the time course across the valid and the invalid conditions for each participant and each reward condition, in the same way as for the time-frequency analysis. For each shuffling, the time-frequency analysis and cross-correlation analysis were performed on the shuffled signals. The whole procedure was performed 1,000 times, resulting in a distribution of correlation values of the valid–invalid difference of the alpha power profile between high-reward and low-reward conditions at each lag point, from which the p < .05 threshold was obtained. For multiple-comparison correction, the maximum threshold across all time lag points was set as the corrected threshold. The 95% confidence interval of the correlation coefficients was computed by bootstrapping the alpha power difference in the high-reward and low-reward conditions 1,000 times.
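How a lag is read off a cross-correlation can be shown with a minimal sketch. It does not reproduce MATLAB's xcorr or its lag sign convention; the convention used here is defined in the docstring, and the function names and the energy normalization are assumptions.

```python
from math import exp, sqrt


def cross_correlation(x, y, max_lag):
    """Return {lag: r} with r(lag) = sum_t x[t] * y[t + lag], scaled so that
    a signal correlated with itself gives r(0) = 1. In this sketch a positive
    peak lag means that y trails x by that many samples."""
    norm = sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        s = sum(x[t] * y[t + lag]
                for t in range(len(x)) if 0 <= t + lag < len(y))
        out[lag] = s / norm
    return out


def peak_lag(x, y, max_lag, step_ms=20):
    """Return (lag in samples, lag in ms, correlation) at the maximum."""
    r = cross_correlation(x, y, max_lag)
    lag = max(r, key=r.get)
    return lag, lag * step_ms, r[lag]


# Toy signals: y is x delayed by three 20-ms samples.
x = [exp(-((t - 20) / 3) ** 2) for t in range(46)]
y = [exp(-((t - 23) / 3) ** 2) for t in range(46)]
print(peak_lag(x, y, max_lag=10))
```

With SOAs sampled every 20 ms, each lag step corresponds to a 20-ms shift between the two time courses.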

To measure the variation of the correlation coefficients across participants, the jackknife method (Miller, 1974) was used to estimate the standard error of the mean (SEM).

$$ {\mathrm{SEM}}_{\mathrm{jack}}=\sqrt{\frac{N-1}{N}{\sum}_{i=1}^N{\left({C}_{-i}-\overline{c}\ \right)}^2} $$

Here, N is the number of participants, and \( C_{-i} \) is the correlation coefficient obtained with cross-correlation analysis computed from the subsample including all participants except participant i. To obtain each \( C_{-i} \), the grand average of the alpha power difference between valid and invalid conditions was computed for each reward condition, averaging across all participants except participant i. We then applied cross-correlation analysis to the averaged alpha power differences. \( \overline{c} \) is the mean of the correlation coefficients \( C_{-i} \) across the N subsamples.
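The jackknife formula above translates directly into code. In this sketch, `statistic` is a hypothetical callable standing in for the whole pipeline that maps a set of participants to one correlation coefficient (grand averaging followed by cross-correlation); the function name is also an assumption.

```python
from math import sqrt


def jackknife_sem(per_participant, statistic):
    """Jackknife standard error of a group-level statistic.

    per_participant: list with one entry of data per participant.
    statistic: function mapping such a list to a single number.
    """
    n = len(per_participant)
    # C_{-i}: the statistic recomputed with participant i left out.
    leave_one_out = [statistic(per_participant[:i] + per_participant[i + 1:])
                     for i in range(n)]
    c_bar = sum(leave_one_out) / n
    return sqrt((n - 1) / n * sum((c - c_bar) ** 2 for c in leave_one_out))
```

A quick sanity check on the formula: when `statistic` is the arithmetic mean, the jackknife estimate equals the usual standard error of the mean (sample standard deviation divided by the square root of N).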

Phase coherence analysis

To examine whether the lag difference was due to a phase difference, phase coherence analysis was conducted on the low-theta (2–3 Hz) phase relationship for the alpha power difference time course (valid vs. invalid) for the high-reward and low-reward conditions. The time courses of the alpha power difference were transformed using the fast Fourier transform (MATLAB, fft function) for each participant. The low-theta (2–3 Hz) phase difference between high reward and low reward was then calculated (CircStat toolbox) for each participant. The nonuniformity of the phase differences was tested using the Rayleigh test (CircStat toolbox; Berens, 2009) across participants. Phase coherence analysis was also applied to the alpha power time course for each of the conditions (high-reward, valid; high-reward, invalid; low-reward, valid; low-reward, invalid). The 2–3 Hz phase relationship between the valid and invalid conditions was examined for each of the reward conditions.
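The phase-difference and Rayleigh steps can be sketched without any toolbox. The p value below uses a standard closed-form approximation to the Rayleigh test; whether it matches the CircStat output to the last digit is an assumption, and the function names are hypothetical.

```python
import cmath
from math import degrees, exp, sqrt


def phase_difference(z_a, z_b):
    """Phase of z_a relative to z_b (radians), given two complex Fourier
    coefficients taken at the same frequency."""
    return cmath.phase(z_a * z_b.conjugate())


def rayleigh_test(angles_rad):
    """Return (mean direction in degrees, resultant length r, approximate p)."""
    n = len(angles_rad)
    mean_vec = sum(cmath.exp(1j * a) for a in angles_rad) / n
    r = abs(mean_vec)          # 0 = uniform spread, 1 = identical angles
    big_r = n * r
    # Closed-form approximation to the Rayleigh p value.
    p = exp(sqrt(1 + 4 * n + 4 * (n ** 2 - big_r ** 2)) - (1 + 2 * n))
    return degrees(cmath.phase(mean_vec)) % 360, r, p
```

Each participant contributes one phase difference, so the test asks whether these angles cluster around a common direction across participants rather than spreading uniformly around the circle.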

To further validate the results of cross-correlation analysis, we used time-frequency analysis to show the time course of fluctuated alpha power. The alpha power difference time courses were transformed using continuous complex Gaussian wavelet transforms (MATLAB, wavelet toolbox, order = 4), with frequency from 1 to 25 Hz in steps of 1 Hz. The power profiles as a function of time and frequency were extracted from the outcome of the wavelet transforms. The grand average time-frequency power was calculated across participants for each of the reward conditions. The time-frequency power difference between the high-reward and low-reward conditions was also calculated. Group-level permutation test (the number of iterations = 1,000) was conducted on the time-frequency power difference.

Given the periodically fluctuating alpha power difference, a further prediction is that the correlation coefficients would also fluctuate as a function of temporal lag at a low-theta frequency. To test this prediction, we applied FFT analysis (MATLAB, fft function) to the demeaned correlation coefficients. Statistical significance was determined by a permutation test (permuting the time points of each coefficient, iteration number = 1,000, significance level p < .05). Multiple comparisons across frequencies were corrected.

Power-phase locking analysis

To investigate whether there was a stable cross-frequency coupling in the temporal profile of RTs (Fiebelkorn et al., 2018; Song et al., 2014), we performed normalized power-phase locking analysis (Cohen, 2014) on the nondetrended RT time courses. Because we were most interested in the relationship between the power of high frequencies and the phase of low frequencies, we used the nondetrended rather than the detrended RT time courses to avoid any distortion of the low-frequency phase. The RT temporal courses were zero-padded (50 zero points before and after each RT temporal course) and multiplied by a Hanning window, separately for each condition and each participant. We band-pass filtered (MATLAB, EEGLAB toolbox, two-pass least-squares FIR filtering, 200th order) the zero-padded RT temporal courses within ±2 Hz of center frequencies from 4 to 19 Hz in steps of 1 Hz and extracted the power time courses from the Hilbert transforms of the band-pass filtered signals for each condition and each participant. We also calculated the phase time courses by band-pass filtering (MATLAB, EEGLAB toolbox, two-pass least-squares FIR filtering, 200th order) the same temporal profiles within ±1 Hz of center frequencies from 1 Hz to 16 Hz in steps of 1 Hz and extracting the phase time course from the Hilbert transform of the band-pass filtered signals for each condition and participant. The time courses of power and phase were used for power-phase locking analysis.

To assess the power-phase locking value, a vector of the power-phase time course (power as the length, phase as the angle) was constructed for each phase frequency (1–16 Hz, in steps of 1 Hz) and power frequency (4–19 Hz, in steps of 1 Hz). The mean vector was calculated across all time points and all conditions for each participant. The length of the mean vector was defined as the power-phase locking value (PPLV; Canolty et al., 2006) for a specific pair of phase and power frequencies. To avoid the influence of the nonuniform distribution of phase angles and of large power fluctuations that could be outliers, we performed nonparametric permutation analysis. The power time course was shuffled, and the power-phase locking value was recalculated 200 times, resulting in a null distribution for each pair of phase and power frequencies. The observed PPLV was compared with the distribution of PPLVs under the null hypothesis by subtracting the mean and dividing by the standard deviation of that distribution. This created a normalized Z value of PPLV (PPLVz) for each participant and each pair of frequencies. For group-level statistical analysis, a one-sample t test was applied to test whether PPLVz > 0. False discovery rate (FDR) correction was applied for multiple comparisons (Benjamini & Hochberg, 1995).
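The PPLV and its permutation-based normalization can be sketched for one pair of frequencies. The inputs are assumed to be a power time course and a phase time course (in radians) of equal length that have already been extracted; the function names and the seed are hypothetical.

```python
import cmath
import random
from statistics import mean, stdev


def pplv(power, phase):
    """Length of the mean vector with power as length and phase as angle."""
    n = len(power)
    return abs(sum(p * cmath.exp(1j * ph) for p, ph in zip(power, phase)) / n)


def pplv_z(power, phase, n_perm=200, seed=0):
    """Observed PPLV expressed in SD units of a shuffled-power null."""
    rng = random.Random(seed)
    observed = pplv(power, phase)
    null = []
    for _ in range(n_perm):
        shuffled = list(power)
        rng.shuffle(shuffled)  # breaks the power-phase alignment in time
        null.append(pplv(shuffled, phase))
    return (observed - mean(null)) / stdev(null)
```

Shuffling keeps the set of power values and the phase distribution intact, so a large PPLVz reflects the temporal alignment between power and phase rather than outlying power values or a nonuniform phase distribution.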

Results

The overall response accuracy was high, with a mean percentage correct (± SEM) of 97.27 (± 0.39). Only a small number of trials (percentage mean ± SEM: high-reward valid, 1.33 ± 0.36; high-reward invalid, 2.26 ± 0.89; low-reward valid, 1.47 ± 0.40; low-reward invalid, 2.15 ± 0.85) were discarded as outliers. The distributions of the raw RTs are shown in Supplemental Fig. S1. Below, we focus on the analyses of RTs.

Reward modulation on RT time courses at low-frequency (0–2 Hz)

The 2 × 2 × 46 ANOVA on the low-frequency, 0–2 Hz band-passed RTs showed a main effect of reward (low-frequency RTs: high reward, −0.034; low reward, 0.003), F(1, 21) = 10.89, p = .003, ηp2 = 0.34, a main effect of SOA, F(45, 945) = 6.98, p < .001, ηp2 = 0.250, but no main effect of cue validity (low-frequency RTs: valid, 0.008; invalid, −0.039), F(1, 21) = 4.25, p = .052, ηp2 = 0.168. Importantly, the interaction between reward and cue validity, F(1, 21) = 6.10, p = .022, ηp2 = 0.225, the interaction between SOA and reward, F(45, 945) = 1.57, p = .010, ηp2 = 0.070, and the interaction between SOA and cue validity, F(45, 945) = 11.34, p < 0.001, ηp2 = 0.351, were all significant. The three-way interaction between cue validity, SOA and reward did not reach significance, F(45, 945) < 1. The smoothed (averaging over 10 adjacent SOA data points) data showed the same pattern as the lowpass (0–2 Hz) filtered data (the bottom of Fig. 2). We focused on the lowpass (0–2 Hz) filtered data in the following analyses.

Fig. 2

Normalized RT time courses (normalized within participants across all trials) as a function of cue-to-target SOA (n = 22). For easier illustration, the SOAs are rewritten in seconds. Grand average of RT time courses for high-reward (top left, mean ± SEM) and low-reward (top right, mean ± SEM) conditions, grand average of 0–2 Hz low-pass filtered RT time courses for high-reward (middle left, mean ± SEM) and low-reward conditions (middle right, mean ± SEM), and grand average of smoothed RT time course (averaged span: 10 points) for high-reward (bottom left, mean ± SEM) and low-reward conditions (bottom right, mean ± SEM) are also presented

Based on the interaction between cue validity and SOA, further analyses showed that responses to the target were faster at the valid location than at the invalid location (i.e., a facilitatory effect) when the SOA was 200 ms to 280 ms (paired t test, p < .05, cluster-based permutation corrected, 5 points, cluster-level p < .001), whereas responses were clearly slower at the valid location than at the invalid location (i.e., an inhibition of return effect) when the SOA was 480 ms to 1,080 ms (paired t test, p < .05, cluster-based permutation corrected, 31 points, cluster-level p < .001). The pattern of low-frequency RTs replicated the classical findings of early facilitation and late IOR, repeatedly shown in previous studies (Klein, 2000; Tipper & Kingstone, 2005).

To investigate how the early facilitation and late IOR were modulated by reward, we calculated the RT differences between the invalid and valid conditions, averaging over the short SOAs (200–280 ms) and the long SOAs (480–1,080 ms). Then, we conducted a 2 (reward: high vs. low) × 2 (SOA: short vs. long) ANOVA on the RT differences. Note that we report only the interaction and the pattern of the simple effects to avoid circular analysis of the main effects. The ANOVA showed a significant interaction between reward and SOA, F(1, 21) = 8.69, p = .008, ηp2 = 0.29. Paired t tests on simple effects showed that the RT difference in the short SOAs (i.e., the early facilitatory effect) did not differ between the high-reward and low-reward conditions, t(21) = −1.46, p = .166, whereas the RT difference in the long SOAs (i.e., the IOR effect) was larger in the high-reward condition (0.104) than in the low-reward condition (0.058), t(21) = 2.50, p = .018. Further analyses showed that for the long SOAs, the low-pass RTs at the uncued position were shorter in the high-reward condition (−0.106) than in the low-reward condition (−0.045), t(21) = −4.81, p < .001, while the low-pass RTs at the cued position were not influenced by the level of reward (low-pass RTs: high reward, −0.003; low reward, 0.013), t(21) = −0.89, p = .383. The same pattern was observed on the raw (i.e., unnormalized) RTs (see Table 1).

Table 1 Mean reaction times (ms) and standard deviations across participants as a function of cue validity and cue-to-target SOAs for the high-reward and low-reward conditions

Periodic alpha power inhibition in the cue-valid condition relative to the cue-invalid condition

After the slow-trend (0–2 Hz filtered) signals were subtracted, the remaining RT time courses (see Fig. 3a) were analyzed using time-frequency analysis (see Material and Method section). The alpha power for each of the four conditions (i.e., high-reward, valid; high-reward, invalid; low-reward, valid; low-reward, invalid) showed a periodically changing pattern (see Fig. 3b; similar patterns were observed on the raw RTs; see Supplemental Fig. S4). Further analyses showed that the alpha (8–12 Hz) power for each of the four conditions fluctuated at a delta/low-theta frequency (2–3 Hz; see Fig. 4, middle, FFT analysis, p < .05, corrected across frequencies). For the low-reward condition, the low-theta (2–3 Hz) phases of the alpha power in the valid and invalid conditions showed a marginally significant fixed relationship (see Fig. 4, bottom right, Rayleigh test, n = 22, p = .071), with the phase differences (valid vs. invalid) across participants clustered around a mean of 126°. Such a relationship was not found in the high-reward condition (see Fig. 4, bottom left, Rayleigh test, n = 22, p = .401).
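
The Rayleigh test asks whether a set of phase differences is clustered around a common direction rather than spread uniformly around the circle. The following sketch assumes that one valid-minus-invalid phase difference per participant has already been extracted (that step is described in the Method section and is not reproduced here) and uses a standard closed-form approximation for the p value. The phases below are synthetic.

```python
import cmath
import math


def rayleigh_test(phases_rad):
    """Return (mean direction in degrees, resultant length, approximate p value)."""
    n = len(phases_rad)
    resultant = sum(cmath.exp(1j * p) for p in phases_rad)
    r_bar = abs(resultant) / n  # 0 = uniform spread, 1 = identical phases
    mean_deg = math.degrees(cmath.phase(resultant)) % 360.0
    big_r = n * r_bar
    # Closed-form approximation of the Rayleigh p value.
    p_value = math.exp(math.sqrt(1 + 4 * n + 4 * (n * n - big_r * big_r)) - (1 + 2 * n))
    return mean_deg, r_bar, p_value


# Synthetic example 1: 22 phase differences spread symmetrically around 126 degrees.
clustered = [math.radians(126 - 40 + 80 * i / 21) for i in range(22)]
mean_c, r_c, p_c = rayleigh_test(clustered)

# Synthetic example 2: 22 phase differences spaced evenly around the circle.
uniform = [2 * math.pi * i / 22 for i in range(22)]
mean_u, r_u, p_u = rayleigh_test(uniform)
```

A long resultant vector (the green bar in Fig. 4, bottom) yields a small p value, whereas evenly spread phases yield a resultant length near zero and a p value near 1.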

Fig. 3

Detrended RT time courses and time-frequency power profiles as a function of SOA (200–1,100 ms) and frequency (1–25 Hz). For ease of illustration, the SOAs are presented in seconds. a Grand average (n = 22) of detrended RT time courses for high-reward (left) and low-reward (right) conditions. The shadows denote ±1 SEM. b Grand average (n = 22) of time-frequency power for valid (left) and invalid (right) conditions when the spatial cue was associated with a high reward (top) or a low reward (bottom). (Color figure online)

Fig. 4

Spectrum amplitude and 2–3 Hz phase relationship of alpha power time courses. For ease of illustration, the SOAs are presented in seconds. Top: Alpha power time courses. The alpha power is shown as a function of cue-to-target SOA for each condition. The shadows denote ±1 SEM. Middle: Spectrum amplitudes of alpha power time courses. The spectrum amplitudes of the alpha power time courses are shown as a function of frequency for the high-reward (left) and low-reward (right) conditions, separately. The dashed lines indicate the threshold for statistical significance (p < .05) based on permutation tests (corrected across frequencies). Bottom: The 2–3 Hz phase relationship between the valid and invalid conditions. The 2–3 Hz phase difference of the alpha power time courses between the valid and invalid conditions is shown across participants for the high-reward (left) and low-reward (right) conditions. The green bars indicate the mean phase difference across participants, and the length of each green bar indicates the phase difference coherence value across participants. # indicates a marginally significant effect (p < .10). (Color figure online)

For both the high-reward and low-reward conditions, the power response profiles showed a stronger alpha pattern in the invalid condition than in the valid condition (permutation test, corrected p < .05; see Fig. 5a). FFT analysis on the difference of alpha power between the valid and invalid conditions also showed significant low-theta (2–3 Hz) band fluctuation in both the high-reward and low-reward conditions (permutation test, corrected p < .05; see Fig. 5c). These results suggested that after attention on the two peripheral boxes had been reset by the cue, the RT time courses at the cued location underwent pulsed alpha-band fluctuations relative to those at the uncued location in a delta/low-theta (2–3 Hz) rhythm for both reward conditions.

Fig. 5

Time-frequency power difference and the spectrum of the alpha power difference between the valid and invalid conditions. For ease of illustration, the SOAs are presented in seconds. a Time-frequency power difference. Top: Grand average (n = 22) time-frequency maps of the valid–invalid power difference for the high-reward (left) and low-reward (right) conditions. Bottom: Valid–invalid power difference time-frequency maps thresholded by permutation test. *p < .05 (uncorrected). **p < .05 (within-frequency multiple-comparison correction). ***p < .05 (across-frequency multiple-comparison correction). Red represents positive valid–invalid power difference values; blue represents negative power difference values. b Time courses of the alpha power difference. Grand average of the alpha power difference (n = 22) between the valid and invalid conditions (averaged between 8 and 12 Hz) as a function of cue-to-target SOA. The shadows denote ±1 SEM. c The spectrum of the alpha power difference. FFT results of the grand average of the alpha power difference between the valid and invalid conditions for the high-reward and low-reward conditions. The dashed lines indicate the threshold for statistical significance (p < .05) based on permutation tests (corrected across frequencies). (Color figure online)

We also investigated the time-frequency power difference between the high-reward and low-reward conditions for the valid and invalid conditions separately. For the valid condition, stronger alpha power was observed in the low-reward condition than in the high-reward condition during the short cue-to-target SOAs (200–300 ms), while stronger theta power was observed in the high-reward condition than in the low-reward condition during the long cue-to-target SOAs (700–1,100 ms) (permutation test, corrected p < .05; see Fig. 6, left). For the invalid condition, stronger alpha power was observed in the low-reward condition than in the high-reward condition (permutation test, corrected p < .05; see Fig. 6, right), and this difference showed a periodically changing pattern.

Fig. 6

Time-frequency power difference between the high-reward and low-reward conditions. Top: Grand average (n = 22) time-frequency maps for high versus low power difference in the valid (left) and invalid (right) conditions. Bottom: High–low power difference time-frequency maps thresholded by permutation test. *p < .05 (uncorrected). **p < .05 (within-frequency multiple-comparison correction). ***p < .05 (across-frequency multiple-comparison correction). Red represents positive high versus low power difference values; blue represents negative power difference values. (Color figure online)

Periodic alpha pulses emerged earlier under higher reward

The correlation coefficients of alpha power (valid–invalid) time courses between the high-reward and low-reward conditions are shown in Fig. 7a as a function of shifted lags. The results showed that after a forward shift of approximately 120 ms or 420 ms, the profile of alpha power (valid–invalid) in the low-reward condition showed the largest positive correlations with that in the high-reward condition (permutation test, p < .05, corrected; see Fig. 7a). In other words, after a forward shift of 120 ms or 420 ms, the alpha power profile in the low-reward condition was most similar to the alpha power profile in the high-reward condition. Further analysis showed that the significant correlation after shifting the alpha power profiles was not driven by extreme values from a single participant (jackknife method; see Fig. 7a, right). This finding suggested that the fluctuating alpha power pattern emerged 120 ms earlier in the high-reward condition than in the low-reward condition. An alternative explanation is that the observed pattern was due to a phase difference. However, no low-theta phase difference between the high-reward and low-reward conditions was observed in the alpha power (valid–invalid) time courses (see Fig. 7c, Rayleigh test, n = 22, p = .365). Furthermore, consistent with the fluctuation of alpha power (valid–invalid) time courses at a low-theta band frequency, the correlation coefficients of alpha power (valid–invalid) time courses between the high-reward and low-reward conditions also showed periodic fluctuation at a low-theta band frequency (3 Hz; see Fig. 7b, FFT analysis, permutation test, p < .05, corrected across frequencies).
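
The lagged-correlation logic can be sketched on synthetic data. The example assumes a 20 ms spacing between SOAs and takes a "forward shift" to mean that the low-reward profile is moved earlier in time before being correlated with the high-reward profile; both the profile shape and this shift convention are illustrative assumptions rather than details taken from the analysis reported here.

```python
import math

STEP_MS = 20  # assumed spacing between adjacent SOAs


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlations(high, low, max_lag):
    """Correlate high[i] with low[i + lag] for lag = 0..max_lag samples,
    using only the overlapping part of the two series."""
    return [pearson(high[: len(high) - lag], low[lag:]) for lag in range(max_lag + 1)]


def profile(t_s):
    """Toy alpha-power profile: a decaying 3 Hz fluctuation."""
    return math.exp(-t_s) * math.sin(2 * math.pi * 3.0 * t_s)


# The low-reward profile is the high-reward profile delayed by 6 samples (120 ms).
high = [profile(i * STEP_MS / 1000) for i in range(46)]
low = [profile((i - 6) * STEP_MS / 1000) for i in range(46)]

r_by_lag = lagged_correlations(high, low, max_lag=25)
best_lag = max(range(len(r_by_lag)), key=r_by_lag.__getitem__)
best_lag_ms = best_lag * STEP_MS
```

With a periodic profile, the correlation rises again roughly one cycle after the best lag, which is consistent with the two peaks at 120 ms and 420 ms and with the 3 Hz periodicity of the correlation coefficients.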

Fig. 7

Results of correlation coefficients. a Correlation coefficients with different shifted lags. Correlation coefficients of alpha power profiles (valid–invalid) between the high-reward and low-reward conditions are shown as a function of shifted lags. For ease of illustration, the shifted lags are presented in seconds. Left: Correlation coefficients for the grand average of alpha power difference (n = 22); the gray horizontal dotted line shows the critical correlation coefficient corresponding to the corrected p = .05 (permutation test); the shadows denote the 95% confidence interval estimated with a bootstrapping method. Right: Correlation coefficients estimated with the jackknife method; the shaded area represents ±1 SEM. b Spectrum amplitude of correlation coefficients. Spectrum amplitude of the correlation coefficients is shown as a function of shifted lags (seconds). The black dashed lines indicate the threshold for statistical significance (p < .05) for the permutation test; the red dashed line indicates the corrected threshold across frequencies. c The 2–3 Hz phase relationship of the alpha power difference. The distribution of the 2–3 Hz phase difference (high reward vs. low reward) of alpha power (valid–invalid) profiles is shown across participants. The green bars indicate the mean phase difference across participants, and the length of each green bar indicates the phase difference coherence value across participants. (Color figure online)

Results of the time-frequency analysis on the alpha power profiles (valid–invalid) showed that the alpha fluctuations manifested with a low-theta frequency (see Fig. 8a). The low-theta (2–4 Hz) power of alpha power difference between the valid and invalid conditions was larger in the high-reward than in the low-reward conditions during the SOA of 200–400 ms, and this pattern was reversed during the SOA of 500–700 ms (see Fig. 8b). These findings suggested that the theta modulations on the alpha power difference emerged earlier for the high-reward condition than for the low-reward condition, which was consistent with the results of cross-correlation analysis.

Fig. 8

a Grand average (n = 22) of time-frequency power for valid–invalid alpha power time courses. For illustration, the SOAs are presented in seconds. Left: Grand average power for the high-reward condition. Right: Grand average power for the low-reward condition. The alpha power pattern emerged earlier in the high-reward condition than in the low-reward condition. b Time-frequency power difference (n = 22) of alpha power profiles between the high-reward and low-reward conditions. Left: Grand average power difference between the high-reward and low-reward conditions. Right: Grand average power difference between the high-reward and low-reward conditions thresholded by permutation test. *p < .001 (uncorrected). **p < .001 (within-frequency multiple-comparison correction). ***p < .001 (cross-frequency multiple-comparison correction). c Alpha power (8–12 Hz) and low-theta (1–3 Hz) relationship. Grand mean of the cross-frequency power-phase locking value (n = 22) is shown. The dotted line indicates the frequencies that showed significant power-phase locking; one-sample t test, FDR corrected p < .05. (Color figure online)

To explore the relationship between different frequency components, power-phase locking analysis was applied to the RT time courses across both reward and cue validity conditions. Results showed that the alpha power was locked to the phase of a delta/low-theta (1–3 Hz) component (one-sample t test, n = 22, FDR corrected; see Fig. 8c). The power-phase locking relationship showed no significant difference between the high-reward and low-reward conditions (paired t test, n = 22, FDR corrected).

Discussion

In this study, using the classical spatial cueing paradigm with a dense distribution of SOAs between the cue and the target, we investigated how the rhythmic characteristic of spatial attention is affected by reward. The low-frequency RT time courses showed the classic pattern of an early facilitatory effect (SOAs: 200–280 ms, RTs: invalid > valid) and a late IOR effect (SOAs: 480–1,080 ms, RTs: valid > invalid) for cue validity. The early facilitatory effect was not affected by the level of reward. The IOR effect, however, was enhanced when the spatial cue was associated with a high reward; this enhancement came mainly from the facilitated responses to the target at the uncued location. After the low-frequency signals were subtracted, a recurring alpha inhibition was found at the cued location (relative to the uncued location) in both the high-reward and low-reward conditions; this alpha inhibition fluctuated at a frequency of 2–3 Hz. Moreover, the pattern of the recurring alpha inhibition emerged earlier (~120 ms) in the high-reward condition than in the low-reward condition.

The early facilitation and the late IOR effect shown by the low-frequency (0–2 Hz) data replicated the classic exogenous attentional cueing effects (Chen et al., 2010; Klein, 2000; Posner, 1980; Posner & Cohen, 1984; Tipper & Kingstone, 2005). The low-pass filtering method we used corresponds to the practice of probing behavioral performance with sparsely sampled SOAs (Song et al., 2014). The observation that only responses to the target at the uncued location under long SOAs were modulated by reward was consistent with recent findings (Bucker & Theeuwes, 2014; Engelmann & Pessoa, 2007; Lee & Shomstein, 2013; Shomstein & Johnson, 2013; Small et al., 2005). Together, these results suggest that the low-frequency attention sampling at short SOAs is mainly an automatic process that is hardly affected by reward whereas the low-frequency attention sampling at long SOAs is governed by top-down processes that are susceptible to the influence of reward (Bucker & Theeuwes, 2014; Corbetta, Kincade, Ollinger, Mcavoy, & Gordon, 2000; Lee & Shomstein, 2013).

The time-frequency analysis showed lower alpha power (i.e., alpha inhibition) at the cued location than at the uncued location. This alpha inhibition showed a pulsed pattern that fluctuated at 2–3 Hz. The fluctuation of alpha inhibition underlying behavioral attention sampling is consistent with Song et al. (2014). Importantly, the current study showed that the fluctuation of alpha inhibition occurs irrespective of the level of reward indicated by the cue, suggesting a general role of pulsed alpha inhibition in spatial attentional sampling. Many studies have shown that the phase of alpha-band activity in the cortex can predict perceptual performance (Busch et al., 2009; Dugué et al., 2011; Harris et al., 2018; Jensen et al., 2014; Sherman et al., 2016). The alpha pulses in the RT time courses could be underpinned by the alpha oscillation in the cortex and may represent cognitive processes similar to those reflected by alpha-band neural activity in the cortex (Song et al., 2014). As alpha-band activity has been linked to inhibitory functions during attentional processes in many studies (Händel et al., 2011; Helfrich et al., 2017; Kizuk & Mathewson, 2017; Klimesch et al., 2007; Marshall et al., 2018; Thut, 2006; van Diepen et al., 2019), we suggest that the lower alpha power (alpha inhibition) at the cued location than at the uncued location represents enhanced attentional sampling at the cued location. The alpha inhibition that fluctuated at a low-theta frequency may indicate that the attentional state at the cued location fluctuates at a low-theta frequency, which is consistent with many psychophysics studies showing the dominance of theta in the rhythmic nature of spatial attention (Dugué et al., 2016; Fiebelkorn et al., 2013; Huang, Chen, & Luo, 2015; Landau & Fries, 2012; Song et al., 2014). Taken together, these findings suggest that the rhythmic sampling of spatial attention may be implemented by periodically fluctuating (2–3 Hz) alpha inhibition.

One might argue that the rhythmic behavioral performance observed here was due to microsaccades, the small involuntary fixational eye movements that have been linked to theta-band neural activities during visual perception (Bosman, Womelsdorf, Desimone, & Fries, 2009; Chen, Ignashchenkova, Thier, & Hafed, 2015). However, using paradigms similar to the present one, recent studies showed that the link between neural oscillations and behavioral performance persists even after the removal of trials with microsaccades during the cue–target delay (Fiebelkorn et al., 2018; Landau, Schreyer, Van Pelt, & Fries, 2015; Spyropoulos et al., 2018), indicating that the periodically fluctuating (2–3 Hz) alpha inhibition in the RT time courses cannot be attributed simply to microsaccades.

The most important finding in the current study is that the rhythmic alpha inhibition in the high-reward condition emerges earlier (~120 ms) than the rhythmic alpha inhibition in the low-reward condition, as shown by the cross-correlation analysis. One might argue that this result simply reflects a difference in phase rather than a difference in the onset of the rhythmic alpha inhibition. To test this alternative, we conducted phase coherence analysis on the phase difference of the alpha power (valid–invalid) profiles between the high-reward and low-reward conditions. The results showed no phase difference between the two conditions in the low-theta band. Moreover, we also applied time-frequency analysis to the alpha power difference between valid and invalid conditions for each of the reward conditions. Results showed that the stronger low-theta modulation in the high-reward condition than in the low-reward condition occurred in an early time window (i.e., 200–400-ms SOA). Based on these analyses, we conclude that the critical finding of the cross-correlation analysis was caused not by a phase difference but by an onset difference; the fluctuating alpha power emerged earlier in the high-reward condition than in the low-reward condition.

Power-phase locking analysis showed that across all conditions (cue validity and level of reward), the alpha (8–12 Hz) power in the time course of RTs was significantly locked to the low-theta (1–3 Hz) phase, implying that the alpha power was modulated by the theta phase or vice versa. Considering that the oscillation components in behavioral performance are probably underlain by the oscillation of neural activity in the same frequency (Harris et al., 2018; Landau et al., 2015; Slagter, Lutz, Greischar, Nieuwenhuis, & Davidson, 2009), the current results are consistent with previous electrophysiological studies (Fiebelkorn et al., 2018; Helfrich et al., 2018) showing that the rhythmically alternating attention state is reflected by alpha power (i.e., the high attentional state is reflected by an alpha inhibition) and modulated by theta phase (i.e., a high attentional state is shown in a proper theta phase and a low attentional state is shown in the opposite phase) in the frontoparietal areas. Previous studies have shown that the theta and alpha components in the frontoparietal areas could be modulated by reward (Crowley et al., 2014; Kamarajan et al., 2008; Shankman, Sarapas, & Klein, 2011; Wang et al., 2019; Yang, Jacobson, & Burwell, 2017). We propose that, because the time-spectral pattern (alpha pulses modulated by theta phase) in our behavioral data is likely underlain by neural oscillations in the frontoparietal areas, the modulatory effect of reward on the rhythmic characteristics of spatial attention is likewise underlain by the modulation of reward on the frontoparietal oscillation network.

It has been consistently shown that reward can affect attentional processes (Anderson, 2015; Anderson et al., 2011; Anderson et al., 2016; Anderson, Laurent, & Yantis, 2014; Wang et al., 2015; Wang, Duan, Theeuwes, & Zhou, 2014). Extending these findings, we showed that, in the low-frequency band, reward facilitated attentional orienting to novel locations at long SOAs, and that, beyond the low-frequency band, reward led to a shorter latency of theta-modulated alpha in attentional sampling. Both of these processes are governed by top-down control, which is supported by neural oscillations. Previous studies have shown that neural oscillations in the frontoparietal cortex could be modulated by top-down processing (Helfrich et al., 2017; Phillips, Vinck, Everling, & Womelsdorf, 2014). Electroencephalogram studies showed that the temporal predictions of a stimulus could bias the phase of alpha band activity toward the optimal phase for visual processing (Samaha, Bauer, Cimaroli, & Postle, 2015), and the alpha activity modulated by prediction is dominated by the phase of frontal low-theta activity (Helfrich et al., 2017). The attentional processing reflected by lower alpha power in the posterior-occipital cortex is modulated by reward (Marshall et al., 2018). Moreover, it is generally believed that the theta-band neural oscillations in the middle frontal cortex are related to cognitive control (Cavanagh & Frank, 2014). Recent studies have shown that the frontal theta oscillations are modulated by reward (Kang, Chang, Wang, Wei, & Zhou, 2018; Wang et al., 2019). Taken together, these findings suggest that reward might act to motivate the cognitive control that is supported by theta activities in the frontal cortex, which in turn modulate the attentional processes that are supported by alpha activities in the posterior-occipital cortex. Indeed, we can speculate that the reward-predictive cue resets the frontal (FEF) theta phase to the optimal phase for visual processing immediately after the cue onset, which in turn reduces the alpha power in the parietal-occipital cortex. Relative to a low-reward cue, the high-reward cue facilitates phase resetting, although how the reward system (i.e., mesolimbic dopaminergic circuits) interacts with the frontoparietal network to modulate the periodic sampling of the environment is an important issue for future research.

In summary, using a time-resolved measurement, the current study reveals that different frequency components of the RT time courses in spatial attention are modulated by reward in different ways. In the low-frequency band, reward facilitates the response to a target at the uncued location, but only at long SOAs. Beyond the low-frequency band, alpha inhibition that fluctuates in a low-theta frequency is observable at the cued location relative to the uncued location, reflecting the rhythmic characteristic of spatial attention. Importantly, this rhythmic alpha inhibition emerges earlier when the spatial cue is associated with a high reward than when the cue is associated with a low reward. These findings suggest that the rhythmic sampling of spatial attention is a general phenomenon but the onset of its occurrence could be modulated by reward.