Introduction

A fundamental, visually striking and categorical issue in relation to any language is whether or not, in its written form, it is word spaced. Most alphabetic languages (e.g., English) are presented with inter-word spaces that delineate word units and provide explicit cues as to a word’s length. However, some languages are written without inter-word spaces (e.g., Chinese, Thai and Japanese). A number of important eye-movement studies of reading report highly consistent findings showing that the removal of word spacing in normally spaced languages disrupts word identification and saccadic targeting, and reduces reading efficiency more generally (e.g., Rayner et al., 1998). Specifically, when word spacing is removed, readers make longer and more fixations than in normally spaced reading. In relation to saccadic targeting, readers are more likely to orient towards the word beginning, though it is important to note that the word centre is the likely optimal location at which to fixate a word for efficient identification. Certainly, in the eye-movement literature on reading in normally spaced languages, saccades appear to be targeted to the word centre, with increased numbers of initial fixations made towards this point (Preferred Viewing Location; Rayner, 1979).

However, whilst adding inter-word spaces to normally unspaced languages such as Chinese does not facilitate reading behaviour in Chinese adults (e.g., Bai et al., 2008), robust facilitation does occur in children and those learning Chinese as a second language (Blythe et al., 2012; Shen et al., 2012). Without question, existing research indicates that words are very important in Chinese reading (see Li et al., 2015). Thus, a theoretical question that naturally arises in respect of unspaced languages concerns how readers effectively make word demarcation decisions as they move their eyes to read unspaced languages like Chinese (Zang et al., 2011).

Wang et al. (2021) examined how exposure frequency affects learning of Landolt-C clusters (i.e., three Landolt-C rings with unique combinations of varied orientations) and subsequent scanning of Landolt-C strings under different spacing formats (i.e., spaced strings, unspaced strings, and unspaced strings with alternating shadings, or alternating in black and grey). Wang et al. obtained robust exposure frequency effects during learning, however, those effects did not carry over to the scanning session in which participants were required to detect pre-learnt clusters embedded in longer sentence-like Landolt-C strings. Furthermore, Wang and colleagues found robust effects of spacing in the scanning session such that fixations were shortest and saccades longest in the spaced condition, fixations were somewhat longer and saccades shorter in the shaded condition, with the longest fixations and shortest saccades in the unspaced condition. Also, in relation to landing position distributions, the majority of initial fixations were made close to the beginning of clusters for both unspaced and shaded unspaced conditions; by contrast, the majority of fixations landed towards the centre of clusters in spaced conditions.

One of the most noteworthy aspects of Wang et al.’s study was the absence of any influence of frequency during scanning of Landolt-C strings (this despite clear exposure frequency effects during the learning session). Wang and colleagues argued that one reason for the failure to obtain exposure frequency effects during the scanning session might have been because participants were unable to effectively detect pre-learnt targets when they were embedded in longer strings. On the assumption that target identification was a prerequisite for a frequency effect to occur, then no such effect would occur if participants were unable to identify the targets. The failure to identify the pre-learnt targets during the scanning session in the Wang et al. study was very likely due to the fact that the Landolt-C clusters were extremely difficult to maintain in memory because they were very unlike words in many respects. Indeed, Wang et al. purposefully stripped away the linguistic characteristics of the clusters that were to be learnt in order to focus on how the visual familiarity of clusters affected processing in a reading-like visual search task. To this extent then, the stimuli in Wang et al.’s study required participants to engage in relatively shallow levels of (predominantly visual) processing rather than deeper levels of linguistic processing.

To assess whether the nature of the stimuli in the study by Wang et al. significantly contributed to the difficulty of the scanning task, thereby resulting in a lack of frequency effects, we repeated the experiment in the current study using orthographically regular and pronounceable English pseudoword stimuli. We suggest three reasons why novel pseudowords may be much easier to instantiate, represent and store in memory compared with Landolt-C clusters. First, all the elements that constitute pseudowords are letters of the alphabet for which there are already existing memory representations. Thus, there is no necessity for readers to create representations for the constituent elements of the novel strings. Second, pseudowords are pronounceable. When compared to unpronounceable Landolt-C clusters, pseudowords convey phonological information and thereby afford the possibility that readers may form a richer memory representation. Furthermore, the phonological form of a string will map directly onto existing representations of phonemes stored in memory. Third, there is reduced visual similarity between the pseudowords in the current stimulus set relative to the set of Landolt-C stimuli used by Wang et al. The reduced similarity within the set of pseudoword stimuli adopted here will reduce inter-stimulus interference and therefore increase the chances that participants find them easier to remember. Taken together, these characteristics of pseudowords should ensure that the stimuli in the present experiment are more memorable than those of Wang et al., and therefore, they should provide an increased opportunity to observe frequency effects both during learning and later during the reading-like pseudoword scanning task.

To reiterate, in the current study, we aimed to examine whether exposure frequency effects might be established during learning of pseudowords and how alternative spacing formats might affect the ease with which those pre-learnt target pseudowords are identified and saccades targeted to them in a search task during string scanning. As word frequency effects in the reading literature are considered as a temporal marker of lexical processing, the occurrence of interactive effects between word frequency and the visual format of text may indicate that the visual demarcation within text does not simply affect visual processing but might also affect aspects of linguistic processing, namely, lexical processing during reading. It is for this reason that we were also interested to examine whether or not there were interactive effects between the visual format of text and exposure frequency in the current reading-like string scanning task. We constructed the following hypotheses: First, we predicted recognition accuracy for pseudowords in both learning and scanning sessions would be increased in the present study relative to that reported by Wang et al. Second, the rate of learning (cf., Wang et al., 2021), should be faster for high-frequency (HF) triplets than for low-frequency (LF) triplets. Importantly, we predicted that a significant difference in the rate of learning for high relative to low frequency triplets should appear earlier during learning for pseudowords than it did for the Landolt-C clusters used by Wang et al. Finally, in relation to our spacing manipulations for string scanning, we predicted reading-like scanning of strings would be most difficult in unspaced conditions, less difficult in the shaded conditions and easiest in the spaced conditions. HF pseudowords should be processed faster than LF pseudowords attracting shorter and fewer fixations (i.e., main effects of exposure frequency). 
Previous studies of alphabetic reading have demonstrated that removal of spacing and a lack of word boundary demarcation (e.g., via shading) causes disruption to processing (e.g., Perea & Acha, 2009; Perea et al., 2015; Rayner et al., 1998). Furthermore, these studies also showed that increased disruption to processing resulted in frequency effects of increased magnitude. The more difficult the visual conditions make it to lexically identify a word, then the more pronounced the frequency effect. Thus, in line with these findings, we also predicted that spacing manipulations would have a modulatory influence on reading times such that the largest frequency effects would occur in the unspaced conditions, somewhat smaller effects in the shaded conditions and the smallest in the spaced conditions. Finally, in line with the Wang et al. study, with respect to where to target the eyes, readers should have a high likelihood of landing towards the centre of words in spaced conditions, whilst the eyes should be more likely to land towards the beginning of words in the unspaced and shaded conditions.

Method

Participants

In line with the Wang et al. Landolt-C study, we conducted a power analysis on the interactive effects between exposure frequency and spacing format using the PANGEA power application (Westfall, 2015). The result indicated that with 80% power of observing a medium effect size (Cohen’s d = 0.5), the minimum number of participants necessary for the present experiment was 24. We, therefore, tested 36 native English speakers from the University of Southampton with normal vision or corrected-to-normal vision.

Apparatus

The equipment and programmes used in the present experiment were the same as those used by Wang et al. The experiments were programmed in Experiment Builder and run on a 20-in. CRT monitor with a refresh rate of 75 Hz. Stimuli were displayed in Calibri font, size 36, on a 95% white background screen with a resolution of 1,280 × 1,024 pixels. Participants were seated 70 cm from the monitor, and at this viewing distance, 1° of visual angle approximated 1.3 letters. During testing, a chin and forehead rest was used to minimise participants’ head movements. We used an Eyelink 1000 eye tracker to record participants’ eye movements during both the learning and scanning sessions. Viewing was binocular, but only the movements of the right eye were recorded.

Stimuli

We constructed 552 three-letter pseudowords that were orthographically regular and pronounceable as the stimuli. Each pseudoword contained three different letters. The consonant-vowel structure of the pseudowords could be one of four patterns: CVC (consonant-vowel-consonant), VCV (vowel-consonant-vowel), CVV (consonant-vowel-vowel) and VVC (vowel-vowel-consonant). For example, the pseudoword ruz has a CVC structure. An equal number of pseudowords in each category was generated.
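The structural constraints just described (three different letters, one of four consonant-vowel patterns) can be sketched as a simple filter. This is an illustrative sketch only, not the authors' stimulus-generation procedure: the function names are hypothetical, the letter y is assumed to count as a consonant, and the check says nothing about pronounceability or orthographic regularity, which would need separate screening.

```python
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant
ALLOWED_PATTERNS = {"CVC", "VCV", "CVV", "VVC"}  # the four structures named in the text


def cv_pattern(pseudoword: str) -> str:
    """Return the consonant-vowel skeleton of a string, e.g. 'ruz' -> 'CVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in pseudoword.lower())


def is_candidate(pseudoword: str) -> bool:
    """Check length, letter uniqueness and consonant-vowel structure only."""
    return (
        len(pseudoword) == 3
        and pseudoword.isalpha()
        and len(set(pseudoword.lower())) == 3  # three different letters
        and cv_pattern(pseudoword) in ALLOWED_PATTERNS
    )
```

For the two example items that appear in the text, `cv_pattern("ruz")` gives `"CVC"` and `cv_pattern("oab")` gives `"VVC"`.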

Again, as per Wang et al., we selected 24 pseudoword triplets (six triplets from each category) as targets that participants were to learn in the learning session and to identify in the scanning session. In the learning session, like Wang et al., we included three learning assessments to evaluate the extent to which participants had learnt the target triplets. A unique set of 24 distractors was selected from the pseudoword triplet database for each learning assessment block and these distractors were not used in the subsequent scanning session.

The remaining 456 pseudoword triplets were used to compose longer text-like strings for the scanning session. Each string was ten triplets long, and, in total, we generated 48 frames of sentence-like strings (see Fig. 1). Half of the strings had a pre-learnt pseudoword target embedded within them, positioned equally often in the second to the eighth triplet position within each string. The 24 remaining strings contained no target. In the scanning session, we manipulated the spacing format of the strings to be spaced, unspaced or shaded. The same string frames were presented under the three spacing formats in separate blocks; therefore, 144 experimental strings were generated in total. Although pseudowords do not exist in real languages, they may vary in relation to visual familiarity. To minimise the potential influence of visual familiarity in relation to the pseudowords in our experimental stimulus set, we experimentally controlled the number of orthographic pseudoword neighbours for both targets and distractors. Here, orthographic pseudoword neighbours are any pseudowords within our set that share two letters in the same positions (e.g., ruz is one of the orthographic neighbours of ruc and vice versa). The mean number of orthographic neighbours was 10 for each distractor triplet and 11 for each target triplet.
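The neighbour definition above (two shared letters in the same positions, counted within the stimulus set) can be sketched as follows. The function names and the four-item toy set are hypothetical and serve only to illustrate the counting rule; they are not the experimental materials.

```python
from statistics import mean


def are_neighbours(a: str, b: str) -> bool:
    """True when two equal-length strings differ in exactly one letter position.

    For three-letter items this is the same as sharing two letters in the
    same positions. Identical strings are not neighbours of themselves.
    """
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def neighbour_counts(items: list[str]) -> dict[str, int]:
    """Count, for every item, its orthographic neighbours within the set."""
    return {w: sum(are_neighbours(w, other) for other in items) for w in items}


toy_set = ["ruz", "ruc", "rac", "oab"]  # hypothetical toy set
counts = neighbour_counts(toy_set)
mean_neighbours = mean(counts.values())
```

In this toy set, ruc has two neighbours (ruz and rac), ruz and rac have one each, and oab has none.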

Fig. 1

Example pseudoword strings displayed under the three spacing formats: spaced string, shaded string and unspaced string. The target triplet in this example, oab, was in the seventh triplet position

Experimental design

The experimental design of the present study was identical to that of Wang et al. There was a learning session followed by a scanning session. In the learning session, we manipulated the exposure frequency of target pseudowords. Participants learnt these targets cumulatively over five learning blocks. Targets designated to be high frequency were presented four times per learning block, whilst targets designated to be low frequency were presented once per block. Therefore, after five learning blocks, HF targets had been presented 20 times and LF targets five times. We also included a learning assessment block after the first, the third and the fifth learning blocks to evaluate the extent to which participants had remembered the targets.
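The exposure arithmetic above (four versus one presentation per block, over five blocks) can be checked with a short sketch. The target labels and their frequency assignment are hypothetical, and the random ordering of trials within a block is an assumption that the text does not state.

```python
import random
from collections import Counter

HF_PER_BLOCK, LF_PER_BLOCK, N_BLOCKS = 4, 1, 5  # values taken from the design


def learning_block(hf_targets, lf_targets, rng):
    """Build one learning block: each HF target four times, each LF target once."""
    trials = list(hf_targets) * HF_PER_BLOCK + list(lf_targets) * LF_PER_BLOCK
    rng.shuffle(trials)  # assumption: trial order is randomised within a block
    return trials


def cumulative_exposures(hf_targets, lf_targets, seed=0):
    """Total number of presentations of each target after all learning blocks."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(N_BLOCKS):
        totals.update(learning_block(hf_targets, lf_targets, rng))
    return totals


# Hypothetical labels and frequency assignment, for illustration only.
totals = cumulative_exposures(hf_targets=["ruz", "oab"], lf_targets=["ide", "tae"])
```

Each hypothetical HF label ends up with 20 presentations and each LF label with five, matching the totals given in the text.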

After the learning session was completed, participants were required to undertake a target detection task during the scanning of pseudoword strings. The strings were displayed under the three spacing formats. We rotated the assignment of exposure frequency of the selected target pseudowords across participants. Furthermore, we counterbalanced the sequence of running blocks according to a Latin Square design.
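One way to picture the Latin Square counterbalancing of the three format blocks is the cyclic square sketched below. Both the cyclic construction and the mapping of participants to rows are assumptions made for illustration; the text does not specify which square or assignment rule was used.

```python
FORMATS = ["spaced", "shaded", "unspaced"]  # the three spacing formats


def latin_square(conditions):
    """Cyclic Latin square: each condition occurs once per row and per column."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def block_order(participant_index, conditions=FORMATS):
    """Assumed rule: participants cycle through the rows of the square."""
    square = latin_square(conditions)
    return square[participant_index % len(conditions)]


square = latin_square(FORMATS)
```

Under this assumed rule, participant 0 would run the blocks in the order spaced, shaded, unspaced, and participant 1 in the order shaded, unspaced, spaced.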

Procedure

Again, all aspects of the procedure mirrored those of Wang et al. During both learning and scanning sessions, eye movements were recorded. Calibration was carried out until the mean error was less than .5° for the learning session and less than .2° for the scanning session. Recalibration was carried out whenever necessary.

In the learning session, each trial started with a box appearing slightly to the left of the centre of the screen. Once the participant fixated the box, a pseudoword appeared in the middle of the screen, simultaneously with a square appearing on the right-hand side of the screen. The square could appear in any one of four positions at different points on the same vertical line. Participants were encouraged to try their best to remember the displayed pseudoword. In each learning trial, the time to learn a pseudoword was self-paced. Once participants felt they had learnt the pseudoword, they were required to make a saccade from the pseudoword to the square on the right of the screen, thereby terminating the trial and initiating a blank screen. The procedure in the learning assessment trials was very similar to that in the learning trials. However, instead of using a square on the right as a trigger to terminate each trial, in each learning assessment trial, participants pressed a button to indicate whether they felt they had, or had not, previously learnt the displayed pseudoword.

After the learning session, participants took a break before continuing with the scanning session. In each scanning trial, participants first fixated a square on the left of the screen causing a sentence-like pseudoword string to appear across the screen. Participants were instructed to scan through the pseudoword string triplet by triplet from left to right and make a decision as to whether or not the string contained a target they had learnt in the learning session. Participants pressed a button to terminate the trial and then pressed another button to provide their ‘yes/no’ response. If participants detected a target, they pressed the ‘yes’ button and then typed the detected pseudoword into the computer using a keyboard.

During both the learning session and the scanning session, participants had short breaks whenever needed. In total, the experiment took approximately 2.5 h to complete.

Results

Linear mixed-effects models (LMMs; see Bates et al., 2016) including both fixed factors and random factors (i.e., items and participants) were run in the R environment (R Core Team, 2018) for each analysis (see Footnote 1). For binary data, such as accuracy, logistic generalised linear mixed-effects models (GLMMs) were used. All p values were computed using the lmerTest package (Kuznetsova et al., 2017).

Separate analyses for the learning session and the scanning session were conducted. All the measures we computed here were the same as those used in Wang et al. (2021). In the learning session, we examined three eye-movement measures: First fixation duration, fixation number, and total viewing time (i.e., the sum of all fixations including refixations in an interest area). In the scanning session, in addition to these three eye-movement measures, we examined: Mean fixation duration, mean saccade amplitude, gaze duration (i.e., the sum of all first-pass fixations in an interest area prior to a fixation outside the area), mean incoming saccade length, mean outgoing saccade length, and mean landing position. In both sessions, we examined behavioural measures (e.g., mean accuracy, hit rate).
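The fixation time measures defined above can be illustrated with a minimal sketch that derives them from a chronological list of fixations. The `Fixation` record, the function name and the rule used to detect first-pass skipping (a fixation on a later triplet before the target is first fixated) are assumptions for illustration, not a description of the analysis software used in the study.

```python
from typing import NamedTuple


class Fixation(NamedTuple):
    area: int      # index of the triplet (interest area) being fixated
    duration: int  # fixation duration in ms


def local_measures(fixations, target):
    """Compute fixation time measures for one target interest area."""
    on_target = [f.duration for f in fixations if f.area == target]
    first_fix = gaze = None
    for i, f in enumerate(fixations):
        if f.area > target:
            break  # assumed skipping rule: a later area was fixated first
        if f.area == target:
            first_fix = f.duration
            gaze = 0
            for g in fixations[i:]:  # sum the first-pass run on the target
                if g.area != target:
                    break
                gaze += g.duration
            break
    return {
        "first_fixation_duration": first_fix,
        "gaze_duration": gaze,
        "total_viewing_time": sum(on_target),  # includes refixations
        "fixation_number": len(on_target),
    }


# Hypothetical scanpath: two first-pass fixations on triplet 2, then a regression.
scanpath = [Fixation(0, 200), Fixation(1, 220), Fixation(2, 250),
            Fixation(2, 180), Fixation(3, 210), Fixation(2, 190)]
measures = local_measures(scanpath, target=2)
skipped = local_measures([Fixation(0, 200), Fixation(3, 210), Fixation(2, 190)], target=2)
```

In the hypothetical scanpath, gaze duration covers only the first two fixations on the target, whereas total viewing time also includes the later refixation. When the target is skipped on first pass, the first-pass measures are left undefined (`None`) rather than being taken from the regression.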

For the analysis of each continuous variable, we excluded data beyond ±3 standard deviations from the mean by participant by condition. In the learning session, 2.35% of the data were removed from later analysis due to the trimming procedure. In the scanning session, we trimmed the data according to the same ±3 standard deviations procedure. We also removed trials from the analysis of first-pass scanning measures (i.e., first fixation duration, gaze duration) if skipping occurred during first-pass scanning. Together these procedures resulted in a further 4.21% of the scanning session data being removed.
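The ±3 standard deviation trimming by participant by condition can be sketched as below. The row layout (dictionaries with `participant`, `condition` and a value key) and the function name are hypothetical, and the sketch trims raw values using the sample standard deviation; whether the original analysis trimmed raw or transformed values is not stated in the text.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_by_cell(rows, value_key, n_sd=3.0):
    """Drop observations more than n_sd SDs from their participant-by-condition mean."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row["participant"], row["condition"])].append(row)
    kept = []
    for cell_rows in cells.values():
        values = [r[value_key] for r in cell_rows]
        if len(values) < 2:  # an SD cannot be computed from a single value
            kept.extend(cell_rows)
            continue
        m, sd = mean(values), stdev(values)
        kept.extend(r for r in cell_rows if abs(r[value_key] - m) <= n_sd * sd)
    return kept


# Hypothetical data: twenty 200-ms observations and one extreme value in one cell.
rows = [{"participant": 1, "condition": "spaced", "duration": 200} for _ in range(20)]
rows.append({"participant": 1, "condition": "spaced", "duration": 2000})
kept = trim_by_cell(rows, "duration")
```

Here the single extreme observation is removed and the remaining twenty are kept. Note that with very small cells no value can lie beyond three sample standard deviations, so the criterion only takes effect once a cell contains enough observations.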

Learning session

In the learning session, we examined main effects of learning block, which we regarded as an index of learning development, main effects of exposure frequency, and the interactions between learning block and exposure frequency which, as per Wang et al. (2021), we refer to as the rate of learning effects. As the treatment in each learning block was identical we treated learning block as a numeric factor. The same form of analysis was conducted with respect to learning assessment blocks.

Learning blocks

For the learning blocks, the means and standard errors of the variables we examined are shown in Table 1. The corresponding results of LMM analyses are shown in Table 2. Across the learning blocks, first fixation durations were shorter in the later blocks relative to the earlier blocks. However, neither exposure frequency nor rate of learning effects were significant for first fixation duration. Across learning blocks, total viewing time and total number of fixations were reduced in later relative to earlier blocks, indicating that learning progressed substantially over the five learning blocks. The pattern of learning-block effects on both total viewing time and fixation number was identical to that reported in Wang et al.’s Landolt-C study. However, it is important to note that total viewing times were shorter and fixations fewer in the present study than in the Wang et al. study. Moreover, the size of the learning-block effects was greater than that in the Wang et al. study. These data suggest that the pseudoword stimuli were easier to learn, which resulted in more efficient learning. For total viewing time and total number of fixations, as we predicted, we found robust effects of exposure frequency showing that triplets with four exposures per block received shorter and fewer fixations than those with only one exposure per block. More importantly, we found significant interactive effects of exposure frequency and learning block, that is, rate of learning effects, on total viewing time and fixation number. The pattern of the rate of learning effects suggested that learning was qualified by the exposure frequency of the target triplets (see Figs. 2 and 3 for effects plots). Specifically, the rate of learning (i.e., the slope of the line) was greater for the LF targets relative to the HF targets. Importantly, this pattern of effects is quite different from that reported by Wang et al. in their Landolt-C learning and scanning experiment.
In the Wang et al. experiment, the opposite pattern of effects occurred, namely, learning was faster for HF triplets than LF triplets. As can be seen from Fig. 2, the point at which the rate of learning became much more shallow was across blocks 2, 3 and 4 for HF triplets, that is, somewhat earlier than the point at which learning effects became similarly shallow for the LF triplets (across blocks 4 and 5). We consider the differential learning curves when pseudowords and Landolt-C clusters were adopted as stimuli in more detail in the Discussion.

Table 1 Mean first fixation duration (ms), total viewing time (ms) and fixation number on target pseudowords across the learning blocks
Table 2 Fixed effect estimates from the linear mixed-effects models (LMMs) for first fixation duration, total viewing time and fixation number on target pseudowords across the learning blocks
Fig. 2

Mean total viewing time during the learning of high-frequency and low-frequency target pseudowords across the five learning blocks. The vertical lines represent error bars

Fig. 3

The left panel plots the interactive effect between exposure frequency and learning block on log transformed total viewing time during target learning. The right panel plots the same effect observed on log transformed total fixation number

Learning assessment tasks

In the learning assessments, mean accuracy (i.e., correct recognition and correct rejection) was 87% in the initial block, increasing to 95% in the second block and reaching 97% in the final assessment block. The corresponding false-alarm rates were 10%, 6% and 3% from the first to the final assessment block. Participants were more sensitive to the difference between targets and distractors in the present study relative to their sensitivity in the Wang et al. Landolt-C study (e.g., for the first learning assessment block, d’ for the pseudoword stimuli was 3.16, whereas it was only 1.17 for the Landolt-C stimuli). The formal analysis of mean accuracy showed a robust effect of block (β = 1.17, SE = 0.17, z = –6.99, p < .001), which again demonstrated that learning was very efficient for the pseudowords in the current study (and again, this is in sharp contrast to the effects for the Landolt-C stimuli reported by Wang et al.).
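For readers less familiar with the sensitivity index, d’ is conventionally computed as the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The sketch below shows that standard formula; the rates in the example are illustrative values rather than the study's data, and the text does not say whether any correction for extreme rates was applied.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Standard signal-detection sensitivity: z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    make the inverse normal transform undefined and raise an error.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


example = d_prime(0.90, 0.10)  # illustrative rates, not taken from the study
```

With these illustrative rates d’ is about 2.56, and equal hit and false-alarm rates give a d’ of zero (no discrimination between targets and distractors).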

Mean hit rates, reaction times, first fixation durations, total viewing times and fixation number in the learning-assessment tasks are provided in Table 3 and the results of the GLMM and LMM analyses are provided in Table 4. Concerning the hit rate (i.e., correct recognition of target triplets), we found robust main effects of learning assessment block and exposure frequency such that the hit rate was higher in later blocks relative to earlier blocks, and was higher when HF targets were identified than when LF targets were identified. More importantly, as in the learning blocks, we found rate of learning effects for the hit rate (i.e., interactive effects of frequency and learning assessment block). Similar effects occurred for reaction time in the target present trials. Specifically, participants spent less time identifying HF pseudoword triplets than LF pseudoword triplets during learning assessment; also, times were shorter during the later learning assessment blocks compared to the earlier learning assessment blocks. More importantly, the learning assessment block effect was modulated by exposure frequency such that the rate at which HF targets were identified was reduced relative to that at which LF targets were identified (in line with the pattern shown for total viewing times in Fig. 2). We also examined the eye movement measures across the learning assessments. There were no reliable effects for first fixation duration; however, similar to hit rate and reaction time, we found robust effects of learning assessment block, exposure frequency and rate of learning on total viewing time and fixation number. These robust effects were comparable across the eye movement measures and behavioural measures adopted in the learning assessment blocks. Moreover, the patterns of effects occurring in the learning assessment blocks were also identical to those in the learning blocks. These results obtained in the learning session are very important.
On one hand, the comparable patterns of results from the eye movement analyses and the analyses of the off-line, behavioural measures in the learning-assessment tasks strongly suggest that eye movements are a good index of online learning of pseudowords. On the other hand, the occurrence of complementary effects across learning and learning assessment blocks indicated that our participants did effectively learn the target triplets, and moreover, that the degree of visual familiarity differed between HF target triplets and LF target triplets, and this itself modulated learning.

Table 3 Mean hit rates, reaction time, first fixation duration, total viewing time and fixation number on targets in the learning-assessment tasks
Table 4 Fixed effect estimates from generalised mixed-effect models for hit rate and linear mixed-effects models for reaction time, first fixation duration, total viewing time, fixation number on targets in the learning-assessment tasks

To sum up, in the learning session, we obtained robust effects of learning block, exposure frequency and the rate of learning on almost all the measures we examined (except for first fixation duration) across both learning blocks and learning assessment blocks. These results clearly demonstrated that learning was very efficient and our efforts to simulate exposure frequency effects were successful in the present study. Furthermore, consistent with our predictions, learning was more efficient using pseudoword stimuli compared to Landolt-C stimuli that were adopted in Wang et al.’s Landolt-C study. Somewhat surprisingly, the rate of learning HF targets was reduced relative to that for learning of LF targets in the current study. This pattern of effects is the opposite to the learning rate effects that occurred in Wang et al.’s Landolt-C study.

Scanning session

After completion of the entire learning session, participants were required to complete a target detection task in the scanning session. The pre-learnt target triplets were embedded in longer strings of pseudowords that were presented under different formats. In the scanning session, we examined the effects of exposure frequency, spacing format and the interactive effects between exposure frequency and spacing format. The mean results for both behavioural and eye movement measures are provided in Table 5 and the corresponding analyses from the LMMs or GLMMs are provided in Table 6.

Table 5 Global measures from observations on all pseudowords and local measures from observations on target pseudowords during scanning
Table 6 Fixed effect estimates from generalised mixed-effect models and linear mixed-effects models for the global measures and local measures

We first computed behavioural measures in relation to participants’ performance in detecting target triplets within pseudoword strings: mean accuracy (i.e., correct detection when a target was present in a trial and correct rejection when no target was present in a trial), hit rate (i.e., proportion of correct detections when a target was present) and false-alarm rate (i.e., proportion of incorrect detections when a target was absent). Compared to the final learning assessment block in the learning session, for the scanning session, mean accuracy reduced by 17 percentage points (97% vs. 80%), mean hit rate reduced by 27 percentage points (97% vs. 70%) and mean false-alarm rate increased by 7 percentage points (3% vs. 10%). It was not surprising that participants were less sensitive to the differences between targets and distractors when they were presented simultaneously in longer sentence-like strings. Nevertheless, it remains the case that our participants were still able to identify the pre-learnt pseudowords that were embedded in strings on the significant majority of trials. The reduced discriminability in scanning relative to that in the learning assessment suggests that identification of target pseudowords in scanning (where distractor pseudowords likely cause interference) is a more difficult task than the identification of targets in isolation. Also, note that the detection performance during string scanning in the present study was much better than that in Wang et al.’s Landolt-C scanning experiment, which was at chance level. We believe that the much better detection performance in pseudoword string scanning compared to Landolt-C string scanning was mainly due to more effective learning of pseudoword triplets compared to Landolt-C clusters. We formally examined the main effect of spacing format on mean accuracy. The GLMMs showed no main effect of spacing format on mean accuracy.
For hit rate, when exposure frequency was included as a fixed factor alongside spacing format in the analysis, somewhat surprisingly, there was no main effect of exposure frequency, nor was there an interaction between the two (see Footnote 2). These accuracy data seem to suggest that participants were able to detect both HF and LF target triplets effectively and similarly across all the spacing conditions.

Next, let us consider the eye movement data from the scanning session. We examined two global measures that included observations from every triplet in each string: Mean fixation duration and mean saccade amplitude. We also examined several local measures that were based exclusively on data obtained from the target triplet within each string for which the exposure frequency was manipulated in the learning session: First fixation duration, gaze duration, number of fixations, total viewing time, incoming saccade length, outgoing saccade length and mean landing position.

First, we report the results from the global measures. We obtained a robust main effect of spacing format on mean fixation duration, with longer fixations made in the unspaced condition than in the shaded condition. Moreover, fixation durations were longer in the shaded condition relative to the spaced condition. Complementary effects of spacing and shading were also found on mean saccade amplitude. Participants made the longest saccades in the spaced condition compared to the shaded and unspaced conditions. Also, saccades were longer in the shaded condition than in the unspaced condition. These data, at a global level, demonstrate that scanning was easiest in the spaced condition, more difficult in the shaded condition and most difficult in the unspaced condition. The shading manipulation benefited target search during scanning; however, the benefits were smaller than those for the spacing manipulation.

The analysis of the local measures included the main effects of exposure frequency and spacing format, as well as the interaction between the two. The local measures we examined can be divided into fixation time and fixation location effects. We first report the results from fixation times. We found robust effects of the format of the text on first fixation duration such that initial fixations were shortest in the spaced condition, somewhat longer in the shaded condition and longest in the unspaced condition. Neither the exposure frequency effect nor the interaction between exposure frequency and spacing format was significant for first fixation duration. For gaze duration, total viewing time and fixation number, we found similar results to those for first fixation duration. Specifically, gaze durations and total viewing times were longer, and fixations more numerous, in the unspaced condition compared to the spaced and shaded conditions. Also, longer gaze durations, longer total viewing times and increased numbers of fixations occurred in the shaded condition compared to the spaced condition. Again, there were no exposure frequency effects, nor interactive effects of frequency and spacing, for gaze duration, total viewing time and total number of fixations (see Footnote 3).

Next, we report results of our fixation location analyses, including incoming saccade length into the targets, outgoing saccade length from the targets and the mean landing position on the targets. We obtained robust effects of spacing format on all the fixation location measures that we examined. Specifically, the longest incoming saccades and outgoing saccades were made in the spaced condition. Saccades were somewhat shorter in the shaded condition and shortest in the unspaced condition. Similarly, the mean landing position was furthest into the target triplets in the spaced condition, less far into the triplets in the shaded condition and least far in the unspaced condition. However, there were no exposure frequency effects, and no interactive effects between exposure frequency and spacing format, on incoming saccade length, outgoing saccade length, or mean landing position.

Based on the fixation time measures we obtained under the different visual presentation conditions, it appears that readers experienced least difficulty with the task in the spaced condition, somewhat more difficulty in the shaded condition and most difficulty in the unspaced condition. The fixation location results also demonstrate that the visual format of the strings impacted saccadic targeting decisions. Readers appear to target saccades further into strings when processing of that string was easier.

When we consider the initial landing position distributions (see Fig. 4), two main points are apparent (see Footnote 4). First, we found that exposure frequency did not affect saccadic targeting to any great degree. Second, in general, there were two different types of landing position distributions. In the spaced and shaded conditions, landing position distributions presented an inverted-U shape with a peak close to the middle. Specifically, in the shaded conditions, participants directed saccades to the centre of the target triplet, and in the spaced condition, the peak of the inverted-U shifted half a character to the right of the centre of the target. By contrast, in the unspaced conditions, the landing position distribution showed a peak at the beginning of the target triplet and declined through the triplet. These results are very interesting in that these two distinctive patterns of landing position distributions are very similar to distributions that have been reported for words in real languages during natural reading (e.g., Rayner et al., 1998; Zang et al., 2013), for novel words during learning (Liang et al., 2021), and for Landolt-C scanning under different format conditions (Wang et al., 2021). We will consider similarities and differences between the present landing position patterns and those reported previously in the Discussion.

Fig. 4

Initial landing position distributions on the target triplet across all conditions in the scanning session. HF high frequency, LF low frequency
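
As an illustration of how landing position distributions of this kind can be tabulated, the following is a minimal sketch and not the analysis code used in the study. It assumes that each initial landing position is recorded as a distance in characters from the left edge of a three-character target, and that positions are grouped into half-character bins; the function names, the input format, the bin width and the example values are all hypothetical.

```python
from collections import Counter


def landing_distribution(landing_positions, bin_width=0.5, triplet_length=3.0):
    """Return {bin_start: proportion} for initial landing positions.

    landing_positions: distances (in characters) from the left edge of the
    target triplet to the initial fixation (hypothetical input format).
    Positions outside the triplet are ignored.
    """
    n_bins = int(triplet_length / bin_width)
    counts = Counter()
    for pos in landing_positions:
        if 0 <= pos <= triplet_length:
            # A fixation exactly on the right edge is placed in the last bin.
            idx = min(int(pos / bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts.values())
    if total == 0:
        return {i * bin_width: 0.0 for i in range(n_bins)}
    return {i * bin_width: counts[i] / total for i in range(n_bins)}


def peak_bin(distribution):
    """Return the left edge of the bin with the highest proportion."""
    return max(distribution, key=distribution.get)


# Invented example values, chosen only to mimic the two qualitative shapes.
centre_peaked = [1.2, 1.6, 1.7, 1.8, 1.4, 2.1, 0.7, 1.6]
beginning_peaked = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8, 1.2, 1.9]

print(peak_bin(landing_distribution(centre_peaked)))
print(peak_bin(landing_distribution(beginning_peaked)))
```

In this sketch, an inverted-U distribution yields a peak bin near the middle of the triplet, whereas a distribution that declines from the triplet beginning yields a peak in the first bin.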

To summarise, in the scanning session, participants were able to detect approximately 70% of the target triplets that were embedded within the strings, in sharp contrast to the chance-level detection performance reported during Landolt-C string scanning in the Wang et al. Landolt-C study. Nevertheless, in line with the Wang et al. study, it remains the case that exposure frequency showed no influence on target detection during pseudoword string scanning. Nor did we find any interaction between exposure frequency and spacing format on either behavioural or eye movement measures. By contrast, we found robust spacing effects on every eye movement measure that we examined. Both fixation locations and fixation times were markedly affected by spacing format. The presence of triplet boundary demarcations, in the form of either spacing or alternating shadings, facilitated both target triplet identification and saccadic programming, though the facilitation was smaller in the shaded conditions than in the spaced conditions.

Discussion

Recall that, in their Landolt-C study, Wang et al. (2021) reported exposure frequency effects in the learning session; however, those effects did not carry over to the subsequent scanning session in which participants were required to detect a target Landolt-C cluster that was embedded in Landolt-C strings. Wang et al. argued that the reason that exposure frequency effects did not occur during the scanning of Landolt-C strings might simply be participants’ failure to maintain accurate memory representations of pre-learnt targets through to the later scanning session. Therefore, the key motivation of the current study was to examine whether using pseudoword stimuli, that is, stimuli that are more word-like, would simulate exposure frequency more effectively than Landolt-C stimuli during learning, and whether exposure frequency effects would be more likely to occur during scanning, on the assumption that participants would be better able to maintain memory representations for targets and therefore detect those pre-learnt targets embedded within longer strings.

In line with our predictions, we found learning was much more effective and successful using pseudoword stimuli compared to the Landolt-C stimuli that were used in the Wang et al. study. Mainly, this was evidenced by increased recognition accuracy, decreased processing time and greater learning-block effects for the pseudoword stimuli compared to Landolt-C stimuli. Given that the exposure manipulation during the learning session was identical across the two studies, as argued earlier, it seems very likely that the nature of the stimuli affected the extent to which targets were effectively memorised. We provide three explanations as to why learning was more effective for pseudowords compared to Landolt-C clusters. First, all the constituent elements forming pseudowords are English letters that already exist in the alphabetic language of the participants who were tested in this study. Of course, this was not the case for the constituent Landolt-Cs that comprised the Landolt-C triplets. Thus, during the learning of pseudowords, participants were not required to learn novel constituent elements comprising the strings as there were existing representations for each letter in memory. By contrast, during the learning of Landolt-C clusters, participants first had to learn and represent novel, abstract, specific Landolt-C rings with different orientations that were each very similar. Presumably, the lack of familiarity with the Landolt-C stimuli alongside the lack of existing memory representations for those stimuli increased the difficulty of triplet learning. A second obvious difference between pseudoword learning and Landolt-C learning is that the pseudowords in the present study were pronounceable, whereas Landolt-C strings were not. Because of this, participants were able to encode triplets in relation to both their phonological and orthographic characteristics, presumably resulting in a richer and more memorable representation, which likely facilitated pseudoword learning. Third, the reduced similarity between pseudoword triplets (both in relation to their orthographic and phonological forms) will have contributed to more effective learning relative to learning of Landolt-C clusters through reduced competitor interference effects. To reiterate, in the scanning session, target detection performance was much better in this study relative to that in the Landolt-C study, and this result was entirely consistent with our suggestion that pseudoword learning would be much more effective and successful than Landolt-C string learning.

In the learning session, we obtained robust effects of learning block during the learning and during the learning assessment of target pseudowords. Regardless of exposure frequency, time spent processing the targets decreased over blocks indicating that learning of targets progressed effectively through repeated exposure. This type of learning effect was observed in both the present study and in the Wang et al. Landolt-C study. Learning-block effects occurred because as learning block increased, participants experienced more exposure to the targets, and with each additional exposure, the process of instantiating, storing and maintaining the representations of the novel items in memory was enhanced. Despite the general pattern of learning effects being consistent across the two experiments, the magnitude of learning-block effects appeared to vary as a function of the stimulus type being learnt – learning effects were greater in the current study compared to those obtained in the Wang et al. Landolt-C study. Again, greater learning-block effects indicated that learning was more efficient when pseudoword stimuli were used relative to when Landolt-C stimuli were used. Consistent findings of basic learning effects across the two experiments demonstrated that the novel learning and scanning paradigm developed for the Wang et al. study effectively captures and reflects aspects of the nature of learning novel stimuli in isolation. Despite variability in the rate and the extent to which learning occurred, the process of actually establishing novel items in memory did take place in both experiments and to the extent that memory representations were established and were accessible, this fundamental aspect of the results was similar across the two types of stimuli that were examined.

Next, let us consider the exposure frequency effect we aimed to simulate during the learning session. As predicted, robust exposure frequency effects occurred during both the learning and learning assessment blocks. Targets with four exposures per block were learnt faster and more effectively relative to those with just one exposure per block. The same pattern of exposure frequency effect occurred for the learning assessment blocks. That is, targets for which participants received four exposures per learning block were identified faster than those for which participants only received one exposure per learning block. Exposure frequency effects arose because the visual familiarity of the targets increased with increased stimulus exposure. The more visually familiar a target was, the less time was required to access the representation in memory in order to recognize it during a subsequent encounter. Given that the exposure frequency effects appeared in the learning assessments as well as the eye movement data across learning trials, one might consider that these effects were somewhat analogous to the well-documented word frequency effects observed in learning during reading and lexical decision tasks (e.g., Blythe et al., 2012; Hulme et al., 2019; Inhoff & Rayner, 1986; Joseph & Nation, 2018; Joseph et al., 2014; Just & Carpenter, 1980; Liang et al., 2015; Liang et al., 2017; Pagán & Nation, 2019; Rayner & Duffy, 1986; Schilling et al., 1998; Whaley, 1978). At some level, the effective simulations of exposure frequency effects in both the learning and learning assessments of the present experiment might be informative as to how word frequency effects become established in languages (at least during the earliest stages of novel word acquisition). Of course, an important caveat must be made here, namely, that the stimuli that were used in the present study were (quite purposefully) not real words and attributes of semantic meaning were absent. 
Presumably, proper words would involve the development of richer lexical representations in memory, and therefore, perhaps frequency effects for real novel words might develop even more rapidly than effects reported here. This is an empirical question for future investigation. What is clear is that despite the fact that word frequency effects are prevalent, well-documented, and that word frequency is considered one of the most important lexical characteristics of words, as well as one of the primary influences over lexical processing, little is known about how such effects are established (cf., Williams & Morris, 2004).

Perhaps a more interesting aspect of the findings from the learning session relates to the rate of learning effects (i.e., the interaction between learning block and exposure frequency). Compared to the Wang et al. Landolt-C study, the present study showed a quite different pattern of learning rate effects. To be specific, Wang et al. reported frequency effects that increased in magnitude across learning blocks, meaning that learning rate was greater for HF target Landolt-C clusters relative to LF target Landolt-C clusters. In contrast, in the present study, we found that the magnitude of exposure frequency effects decreased over learning and assessment blocks, which suggested that the rate of learning for HF target triplets was reduced relative to that for LF target triplets. Moreover, learning for HF pseudowords reached a reduced plateau much earlier than learning for LF pseudowords. We consider that the major reason for the different rate of learning effects across experiments was the nature of the stimuli. The Landolt-C stimuli adopted in the Wang et al. study were much more difficult to learn compared with the pseudoword stimuli adopted in the present study. It seems that the ease, or difficulty, of learning materials determines the shape of the learning curve. Learning difficult stimuli produces a long, gentle learning curve such that learning progresses at a slow rate in the initial stages, as was the case in learning Landolt-C clusters. In contrast, learning relatively easy materials produces a much shorter learning curve such that the learning rate is rapid, resulting in a curve that is steep in the initial stages and then plateaus, meaning that further learning exposures will result in only very minor reductions in processing time. This was the case in the learning of pseudowords. 
Presumably, if more learning blocks, and therefore exposures, were presented to participants when learning Landolt-C stimuli, the learning curve would likely show three clear distinctive stages of learning – a slow progressive stage in the beginning, a steeper progression in the middle, and eventually a plateau during the final stages of learning. However, as noted, to see the full pattern over extended learning would require a much greater number of learning blocks than was used in the Wang et al. Landolt-C experiment, and consequently, the present suggestions remain speculative. Finally, it is also worth mentioning that, in line with the broader set of learning effect findings in both the present study and the Wang et al. Landolt-C study, any such learning curve would demonstrate qualification by exposure frequency such that HF targets would show more rapid learning effects relative to LF targets.
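
The three-stage learning curve suggested above (slow initial progress, a steeper middle stage and a final plateau) can be illustrated with a logistic function. The following is only a sketch of that shape: the logistic form is our assumption, and the function names and all parameter values (starting time, floor, midpoint, rate, number of blocks) are hypothetical and are not fitted to data from either study.

```python
import math


def processing_time(block, start=900.0, floor=500.0, midpoint=6.0, rate=1.0):
    """Hypothetical processing time (ms) after a given learning block.

    Time falls from roughly `start` to roughly `floor` along a logistic
    curve whose steepest point is at `midpoint`.
    """
    return floor + (start - floor) / (1.0 + math.exp(rate * (block - midpoint)))


def gains(times):
    """Reduction in processing time from each block to the next."""
    return [earlier - later for earlier, later in zip(times, times[1:])]


blocks = range(0, 13)  # hypothetical number of learning blocks
curve = [processing_time(b) for b in blocks]
improvement = gains(curve)

# Improvement per block is small at first, largest near the midpoint,
# and small again once the plateau is reached.
for block, gain in zip(blocks, improvement):
    print(f"block {block} -> {block + 1}: {gain:.1f} ms faster")
```

Under these assumptions, moving `midpoint` towards zero gives a curve that is steep in the initial blocks and then plateaus, which is the shape described above for easier materials.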

Recall that one of the major motivations for the present study was to examine whether using stimuli that are comparatively easy to learn would increase the possibility of observing exposure frequency effects in a later scanning session. As discussed earlier, learning was very effective and successful in the learning session. Consequently, and in line with our predictions, memory for pre-learnt target pseudowords was maintained through to the scanning session to a greater extent. This is in contrast with the chance-level target detection performance reported in Wang et al.’s Landolt-C study. Nevertheless, despite quite effective pseudoword learning, we still did not observe any reliable exposure frequency effects during scanning for any of the measures we examined. The results of the present study clearly demonstrated that the lack of exposure frequency effects during scanning was not due to memory decay of pre-learnt targets. We say this because in the present study, participants were able to effectively detect targets during scanning (thereby reflecting good memory for the targets). In line with Wang et al.’s account, we suggest here too that the lack of exposure frequency effects in the scanning session might be due to: (1) the exposure frequency simulation being insufficiently strong to induce frequency effects in scanning, and (2) the contemporaneous appearance of multiple distractor triplets together with the target triplet (within the same string) diminishing the magnitude of any potential exposure frequency effect. Alternatively, the lack of frequency effects in the scanning session could have arisen because participants were engaged in scanning rather than reading. This raises the question of whether these effects would occur when participants are required to read sentences containing pseudowords rather than simply scan them searching for a target. Indeed, in the literature on novel word learning in reading, increasing evidence shows robust frequency effects: the more exposures a novel word received during learning, the faster it could be identified (e.g., Hulme et al., 2019; Liang et al., 2015; Liang et al., 2017; Liang et al., 2021).

Existing evidence suggests that word frequency does not affect eye movements when the task requires participants to search for a specified target word within normal text (Rayner & Fischer, 1996; Rayner & Raney, 1996; Wang et al., 2019). By contrast, it is still controversial as to whether such frequency effects occur when the task requires participants to search for a target within text-like non-reading stimuli. The present study, together with the Wang et al. Landolt-C study, produced no influence of exposure frequency on eye movements when participants searched for a target within text-like strings. To our knowledge, there is only one study that has demonstrated exposure frequency effects in a reading-like visual search task (Vanyukov et al., 2012). In the Vanyukov et al. study, participants were required to search for a target ‘O’ within text-like Landolt-C strings. They manipulated the exposure frequency of distractor clusters that occurred within their Landolt-C string stimuli (10 exposures, 25 exposures and 50 exposures). Vanyukov et al. (2012) argued that in previous studies, searching for a target word (e.g., zebra) in normal text might have caused participants to engage in very superficial visual processing (e.g., zebra can be easily discriminated from most other words based on its initial letter) and this may have led to a lack of a word frequency effect during visual search. Note, though, that in Vanyukov et al.’s study, a Landolt-C cluster that contained a target letter ‘O’ was very visually similar to the distractor Landolt-C clusters (that did not contain a letter ‘O’) that appeared simultaneously. Thus, it seems very likely that compared to a situation in which participants searched for a target word (e.g., zebra) within normal text, searching for a target ‘O’ embedded amongst Landolt-C clusters would actually be relatively difficult because of the high visual similarity between the target and distractors. On this basis, Vanyukov et al. 
(2012) argued that the more cognitively demanding the reading-like visual search task, then the greater the likelihood that an exposure frequency effect would occur. If this explanation is correct, however, then it seems very likely that in the Wang et al. Landolt-C study, exposure frequency effects should have occurred because, relative to the Vanyukov et al. study, target-distractor similarity in Wang et al.’s Landolt-C study was, arguably, even greater. Thus, on the basis of Wang et al.’s Landolt-C study, it seems that Vanyukov et al.’s suggestion that task demands may be a determinant of exposure frequency effects may not be correct. An alternative possibility may be related to a potentially important point of difference between the study of Vanyukov et al. and the Landolt-C and pseudoword string tasks of Wang and colleagues. In both the Wang et al. Landolt-C study and the current study, the targets that were to be detected within strings were manipulated for exposure frequency during a prior learning session. In contrast, in the study by Vanyukov et al., exposure frequency of distractor stimuli was manipulated. Thus, it is possible that the target/distractor status of a learnt string may be important in relation to whether exposure frequency effects are observed. This is clearly an empirical issue that requires further attention.

Next, we consider how the spacing and shading manipulations facilitated eye movements during the scanning of pseudoword strings. Consistent with the Wang et al. Landolt-C study, we found both spacing and alternating shadings facilitated eye movement control with evidence of shorter fixation durations on, and longer saccades to, target triplets in spaced strings and shaded strings compared to unspaced unshaded strings (see also Perea et al., 2009; Perea et al., 2015; Zhou et al., 2018). The presence of either inter-triplet spaces or alternating shadings very likely facilitated scanning due to the provision of overt visual demarcations of triplet boundaries within the horizontally spatially extended strings. Knowing where a triplet started and ended allowed readers to direct their saccades towards an intended position within that triplet much more readily than when its spatial extent was not visually marked. Also, the presence of spacing/shadings eliminated the occurrence of triplet boundary ambiguity. When letters or characters are immediately adjacent, there is often ambiguity as to whether they belong together as part of a word, or do not (e.g., understandingeniousideas vs. understandingingeniousideas). Also, it is important to note that the facilitatory effects of spacing were greater than those of shading. This is very likely because reduced lateral masking and reduced crowding exist in the spaced strings relative to the shaded unspaced strings. It is well-documented that processing of foveal information is less effective when the neighbouring perceptual units laterally mask and crowd a stimulus (e.g., Wolford & Chambers, 1983).

Interestingly, we found differential patterns of initial landing position distributions between the Wang et al. study (2021) and the present study. In both studies, saccades were more frequently targeted towards the centre of a target in the spaced strings, whereas saccade targeting was shifted towards the beginning of a target in the unspaced strings. This aspect of the landing position data was quite consistent between the two experiments. Importantly, the discrepancy between the studies occurred in the shaded strings. In the present study, participants were more likely to direct their saccades towards the centre of a target string when alternating shadings were present. However, in the scanning of shaded Landolt-C strings, initial landing positions were more likely to land towards the beginning of a target string. Thus, it seems that alternating shadings did not facilitate saccadic targeting in the scanning of Landolt-C strings to a similar degree to which they did in the scanning of pseudoword strings. It is very interesting to consider why the extent to which alternating shadings facilitated saccadic targeting differed across the two experiments. Once again, we believe that this is very likely due to the differential nature of the stimuli used in the two studies. As discussed earlier, Landolt-C clusters provide no information pertaining to sound, meaning, or other linguistic properties and they were very difficult to learn and to identify during scanning. By contrast, English pseudoword triplets are orthographically regular and pronounceable. That is to say, when compared to Landolt-C clusters, pseudowords appear much more word-like. Because of the word-like properties (i.e., fairly regular orthography and fairly simple pronounceability), individual pseudoword triplets are more likely to be processed as a single perceptual unit compared to Landolt-C clusters. 
Despite the fact that alternating shadings indicated the beginning and end of a cluster unit, due to the nature of the Landolt-C clusters, it remains likely that participants were less effective in processing the visually demarcated cluster as a single perceptual unit. Instead, it seems possible that they maintained more piecemeal representations of targets, perhaps resulting in a Landolt-C-by-Landolt-C identification strategy during scanning. In support of this suggestion, the refixation rate on target strings in the Landolt-C experiment was far greater than that in the pseudoword experiment. And consistent with this suggestion, the eyes were more frequently directed to the beginning of a triplet in order to process that triplet on the basis of its individual constituent Landolt-Cs rather than as a unified perceptual unit. Conversely, the presence of alternating shadings in the pseudoword strings likely allowed participants to capitalise upon the word-like characteristics of the triplets, meaning that participants were able to process them triplet-by-triplet (i.e., process them more as unified perceptual units rather than in relation to their constituent parts). Under such circumstances, it is not particularly surprising to see that participants adopted a saccadic targeting strategy similar to that occurring in natural reading, that is, saccadic targeting towards the centre of an upcoming word unit.

To a significant extent, the current findings with respect to saccadic targeting in the scanning of text-like strings, with or without inter-triplet spaces, are very relevant to an important theoretical issue that has been discussed for two decades in the Chinese reading literature. That is, whether or not saccadic targeting in the reading of unspaced Chinese text is word-based (Li et al., 2011; Liang et al., 2021; Ma et al., 2015; Tsai & McConkie, 2003; Yan et al., 2010; Zang et al., 2011; Zang et al., 2013). By splitting the initial landing position data into single-fixation cases and multiple-fixations cases, Yan et al. (2010) found that readers tend to target slightly left of the centre of words in single-fixation cases; however, readers locate their point of fixation towards word beginnings in multiple fixation cases. Accordingly, Yan et al. (2010) proposed that Chinese readers dynamically choose to target a saccade based on whether or not the upcoming word was successfully segmented in the parafovea. Interestingly, this general pattern of findings holds regardless of whether the same text is presented in a spaced or an unspaced format (see Zang et al., 2013), whether the text is normal or shuffled (Ma et al., 2015) and even holds in computational simulations (see Li et al., 2011).

According to the account provided by Yan et al. (2010), similar saccadic targeting behaviours should have occurred for the shaded strings across the present study and the Wang et al. Landolt-C study given that shadings provided clear boundary information in the parafovea in both situations. However, discrepancies occurred in the shaded condition across the two studies. In the scanning of shaded Landolt-C strings, participants made most initial fixations on the beginning of a target. By contrast, in the scanning of pseudoword strings, inverted-U shaped landing position distributions occurred. The results from the present study and those from the Wang et al. Landolt-C study tend to suggest that the factor driving the changes in landing position distributions on the target was not whether participants could, or could not, identify the boundaries of the upcoming target in string scanning. Instead, in our view, the reason for the quite different saccadic targeting behaviour in the present study and that observed in the Wang et al. Landolt-C study is very likely the nature of target processing. In the Landolt-C study, participants engaged in letter-by-letter processing of triplets, whereas in the present study, participants processed the pseudoword letter strings like word units (i.e., as a whole). It seems very likely that this difference in the nature of processing drove the differences in saccadic targeting that were observed across the two studies. A further point that is perhaps worth noting in relation to saccadic targeting differences between Wang et al.’s Landolt-C study and the present study is that these differences occurred even though participants in both studies knew perfectly well that the length of target and distractor strings was constant throughout (i.e., three characters). This suggests that computations associated with saccadic targeting metrics are relatively immune to metacognitive influences.

Conclusion

In the present study, robust learning and exposure frequency effects emerged across learning sessions, indicating that the novel learning and scanning paradigm developed in the Wang et al. Landolt-C study is also effective for pseudoword stimuli. Different patterns for rates of learning occurred in this study compared to the Wang et al. study, suggesting that the nature of the stimuli to be learnt directly impacts the speed at which exposure frequency effects in learning are established. Finally, consistent with previous findings, we obtained robust spacing effects but no exposure frequency effects in scanning during target search. Relative to unspaced stimuli, inter-pseudoword spaces facilitated both target identification and saccadic targeting, and alternating shadings produced a smaller facilitation. Taken together, these findings alongside those of the Wang et al. study demonstrate a dissociation in exposure frequency effects during target string learning relative to target string recognition, a direct relationship between the characteristics of novel stimuli to be learnt and the nature of their learning, and the efficacy of spacing as a perceptual boundary demarcation method to facilitate saccadic targeting in scanning.