Closing the door to false memory: the effects of levels-of-processing and stimulus type on the rejection of perceptually vs. semantically dissimilar distractors

Nieznański, Marek; Obidziński, Michał

doi:10.1007/s00426-021-01544-z

Original Article
Open access
Published: 10 June 2021

Volume 86, pages 968–982, (2022)

Psychological Research

Abstract

False recognition memory for nonstudied items that share features with targets can be reduced by retrieval monitoring mechanisms. The recall-to-reject process, for example, involves the recollection of information about studied items that disqualifies inconsistent test probes. Monitoring for specific features during retrieval may be enhanced by an encoding orientation that is recapitulated during retrieval. In two experiments, we used concrete words or door scenes as materials and manipulated the level of processing at study and the type of distractors presented at test. We showed that for the verbal material, semantic level of processing at study results in an effective rejection of semantically inconsistent distractors. However, for the pictorial material, the perceptual level of processing leads to an effective rejection of perceptually inconsistent distractors. For targets, the effect of levels of processing was observed for words but not for pictures. The results suggest that retrieval monitoring mechanisms depend on interactions between encoding orientation, study materials, and differentiating features of distractors.

Introduction

Incidents of false memories depend on two classes of processes: error-inflating processes that are generally based on familiarity (activation) increased by the shared attributes of targets and lures, and error-editing processes that overall depend on the recollection (monitoring) of features that are distinctive (e.g., Arndt & Gould, 2006). The recollection of study details can be used to avoid false recognition through such decision mechanisms as disqualifying monitoring or diagnostic monitoring (Gallo, 2006). The first occurs when the remembering of one event logically excludes another event as being presented during the study, while the second mechanism is based on the failure to recollect the expected details (cf. Nieznański et al., 2018). For example, in a converging associates memory task, a disqualifying monitoring process can lead to rejection of a related lure because one recalls identifying it as nonstudied during study, whereas a diagnostic monitoring process can result in rejecting an unrelated lure because it does not fit the gist of studied items (Gallo, 2006, p. 204). When the distinctiveness of the study material is enhanced, retrieval monitoring becomes more effective since subjects can make use of the unique features and qualitative differences between targets and lures (Gray & Gallo, 2015). One of the manipulations that was used to influence such recollective distinctiveness was a deep vs. shallow level of processing (Gallo et al., 2008). Our aim was to investigate the effects of levels of processing on disqualifying monitoring depending on the kind of study material (concrete words vs. door scenes). In two experiments, we crossed encoding tasks orienting subjects’ attention toward semantic vs. perceptual features with semantic vs. perceptual features that may be used to exclude lures at retrieval.

In the false memory literature, “recall-to-reject” processes are described as involving the recollection of information that rules out the recognition probe as having been studied (e.g., Gallo, 2004; Gallo et al., 2006, 2007). In other words, these processes facilitate the rejection of test probes that are similar to targets by detecting their mismatch on some of the features (e.g., Carneiro et al., 2012; Rotello et al., 2000). For example, a subject may recollect that the colours of the studied targets differ from the colours of the recognition probe, thus rejecting it as a lure. However, there are at least two necessary conditions for the effective use of such a strategy. The first involves recalling the entire range of features presented at study that are relevant for the test probes. The second is adopting a proper retrieval orientation that enables the monitoring of distinctive information at retrieval.

The classic transfer appropriate processing framework (Morris et al., 1977) suggests that performance on a memory task is enhanced by increases in the overlap between the processes carried out during encoding and those carried out during test (e.g., Nieznański, 2014). More recent research posits that people attempt to take advantage of transfer appropriate processing by recapitulating the operations performed at the time of encoding during retrieval (e.g., Alban & Kelley, 2012; Zawadzka et al., 2017). Therefore, subjects search memory differently when their task is to recognize items that were encoded with a semantic task than when the task is to recognize perceptually encoded items (Kantner & Lindsay, 2013). Jacoby and colleagues (e.g., Halamish et al., 2012; Jacoby et al., 2005) use the notion of source-constrained retrieval for a kind of early selection in which the retrieval processing is constrained in a way that recapitulates study processing (cf. Alban & Kelley, 2012; Danckert et al., 2011; Marsh et al., 2009). A related concept of retrieval orientation was proposed by Rugg and colleagues (e.g., Morcom & Rugg, 2012; Rugg & Wilding, 2000) for a goal-directed strategy adopted at retrieval. They demonstrated, using brain imaging and evoked potentials, that new items on a recognition memory test are processed in a way that depends on how the targets were encoded at study (cf. Kantner & Lindsay, 2013; Zawadzka et al., 2017).

Levels-of-processing effect with verbal vs. pictorial material

The levels-of-processing (LoP) effect refers to the well-known impact of orienting tasks during study on subsequent memory test performance. A hierarchy of qualitatively different levels of analysis can be defined, starting with a sensory (shallow) analysis of the perceptual properties of the to-be-remembered items, and proceeding towards a more elaborate (deep) processing of meaning and semantic associations of the items (e.g., Craik, 2002; Craik & Lockhart, 1972). As shown in numerous studies, semantically encoded items are typically better remembered than perceptually encoded stimuli (e.g., Craik & Tulving, 1975). Such positive LoP effects were usually studied and demonstrated with verbal material. However, a relatively small number of studies have used pictorial material, yielding considerably mixed results (Baddeley & Hitch, 2017; Nieznański, 2020). In some studies, particularly those using pictures from a single broad category (e.g., faces or door scenes), semantic processing led to superior memory performance (Baddeley & Hitch, 2017; Bower & Karlin, 1974; Konstantinou & Gardiner, 2005). But in other studies using more distinctive pictorial material, reversed LoP effects have been reported. For example, in the Intraub and Nicklos (1985) study, questions orienting participants towards the visual characteristics of objects led to a better recall than semantic questions. In our recent study (Nieznański, 2020) using the conjoint recognition paradigm, we also found significant negative LoP effects for pictures. Process-level analyses of the components involved in memory performance that we conducted from the perspective of the dual-recollection theory (Brainerd et al., 2014, 2015) demonstrated a significant enhancement of pictures’ context recollection and null effects for target recollection in the perceptual encoding condition in comparison with the semantic encoding condition. 
In contrast, for verbal material, both context and target recollection were enhanced in the semantic condition. Following the sensory-semantic model (Nelson & Reed, 1976; Nelson et al., 1976, 1977), we can assume that picture recollection may benefit from perceptual encoding due to the greater physical distinctiveness of pictures in comparison with words (Intraub & Nicklos, 1985). However, restricting picture diversity by using stimuli similar in appearance might eliminate this advantage and render semantic features relatively more effective in differentiating the targets from the distractors (cf. Baddeley & Hitch, 2017).

Modest but consistently positive LoP effects were recently reported by Baddeley and Hitch (2017) in a series of experiments using sets of visual stimuli taken from single categories such as doors, clocks or mobile phones. When comparing recognition memory for pictures and words, in one of the experiments, Baddeley and Hitch used lists of door scenes, half of which were predominantly brown, and lists of concrete words printed in various font colours, half of which contained a majority of brown letters. The stimuli were processed either deeply, where the participants judged if they found each item pleasant, or shallowly, in terms of assessing whether the stimulus was predominantly brown. At a four-alternative forced-choice recognition test, three similar distractor items were presented along with the target, but the type of the target–distractor similarity was not systematically manipulated. Deep encoding resulted in better memory in the case of both doors and words, but the effect was markedly smaller for pictorial than for verbal materials. In subsequent experiments, modest LoP effects were confirmed with various pictorial stimuli, whereas for verbal materials, the effects varied much more depending on the available features associated with the verbal material. Such features, according to Baddeley and Hitch, are “offered” by stimuli to be elaborated and may potentially enhance target encoding and their differentiation among distractors. The more diagnostic features are processed, the greater the probability of successful recognition. Baddeley and Hitch proposed the concept of affordances taken from James J. Gibson’s (1977) classical theory of perception as a suitable term for expressing the relationship between the subject and the to-be-remembered materials. From this perspective, a word affords a relatively impoverished perceptual stimulus, but one that can be referred to a rich and complex network of lexical associations. 
In contrast, a picture of a domestic door affords a broad range of visual features that might enhance stimulus familiarity at retrieval, but are relatively useless as diagnostic features when presented among similar domestic doors as distractors. Taking into account the differences between the perceptual and semantic features in their potential impact on recognition memory, Baddeley and Hitch proposed a distinction between perceptual and semantic affordances. In the current study, we further investigate the differences in the consequences of perceptual and semantic feature processing, this time for false memory. We systematically manipulate the type of features in which the distractors share or differ in relation to the targets, rendering the presence (or absence) of certain features useful for retrieval monitoring.

Overview of the experiments

The aim of our research is to demonstrate that retrieval monitoring depends on the interaction between level of processing, materials, and lure type (cf. Chan et al., 2005). Across experiments we compare the usefulness of perceptual and semantic affordances (Baddeley & Hitch, 2017) in retrieval monitoring for words vs. pictures. We hypothesize that subjects monitor their memory for specific features that were distinguished at study by orienting tasks. In particular, we assume that subjects who focus their attention on the colours of the target at study attempt to recapitulate this orientation at retrieval; in consequence, the targets and lure items are monitored for containing appropriate colours. Conversely, subjects who focus on the semantic categories of targets at study disqualify the lures belonging to new categories at test. Moreover, such retrieval orientations should be more successful the more distinctive the features of the targets are. Previous research (e.g., Gallo et al., 2008) suggests that a deep level of processing enhances the recollective distinctiveness for words. However, it is not clear whether the same effect will be present for pictures. On the one hand, the research of Baddeley and Hitch (2017) suggested that the distinctiveness of such pictorial material as door scenes will benefit from semantic processing in comparison with the perceptual encoding task (at least when nondistinctive distractors are used at test). On the other hand, our recent research (Nieznański, 2020) has indicated that context recollection for pictures is enhanced by perceptual processing. Hence, it is probable that the colour orienting condition will result in more effective colour recollection and, in consequence, better disqualification of colour-inconsistent lures.

In Experiments 1 and 2, we used similar materials to Baddeley and Hitch (2017), that is, lists of concrete words printed in coloured fonts (Experiment 1) and lists of door scenes (Experiment 2) encoded either under a semantic or a colour orienting task. At test, in contrast to the Baddeley and Hitch studies, we manipulated the kind of lures that were presented, that is, we used “colour lures” containing colours consistent with the colours presented at study but belonging to inconsistent categories, “category lures” belonging to consistent categories but containing inconsistent colours, and “critical lures” that were consistent both in category and colour with the study items. According to the global matching models of recognition memory (e.g., Arndt, 2015), false recognition is a function of the match between a lure used as a memory probe during test and the memory traces of related targets. Therefore, memory performance in our experiments will depend both on the error-inflating processes based on lure consistency, and the error-editing processes based on lure inconsistency with the encoded traces.

In both experiments, apart from the standard analyses of hit rates, false alarm rates, and mean confidence of responses, we used two alternative measurement models, namely the signal detection (SDT) model, and the two-high-threshold model (2HT) for recognition memory. Such twofold modelling analysis was motivated by a still unresolved debate in the literature as to whether recognition performance is better described as based on a continuous memory process (SDT) or discrete states (2HT) (e.g., Bröder & Schütz, 2009; Dube et al., 2012; Juola et al., 2019; Malejka & Bröder, 2019). Both measurement models enable some specific interpretations of the results. For example, on the one hand, Huff and Bodner (2013) have recommended SDT analyses as a way to disentangle encoding (generally, error-inflating) from retrieval (generally, error-editing) influences. The latter would rather affect the response criterion parameter, while the former is expected to influence the memory sensitivity parameter of the SDT model (cf. Nieznański et al., 2018). On the other hand, the 2HT model introduces a parameter representing high-confidence “no” responses for lures detected as distractors, which may be interpreted as a manifestation of the recall-to-reject process (Rotello et al., 2000). In this way, we can compare the effectiveness of this monitoring mechanism between lure types depending on the encoding condition.

Experiment 1

Methods

Participants

The participants were 53 undergraduates who received extra course credit for volunteering. Their mean age was 19.9 years (SD = 0.90); 11 were men. Each participant was assigned to one of two conditions differing by the encoding instructions: colour naming (N = 27) and category naming (N = 26). The numbers of participants per group were similar to the numbers of participants in the Baddeley and Hitch (2017) Experiments 1 (N = 20) and 2 (N = 24), which used similar materials and conditions. A sensitivity analysis using G*Power 3.1 (Faul et al., 2007) revealed that, assuming a power of 0.80, with our sample size (N = 53), the experiment is sufficiently sensitive to detect a small-to-medium effect size of f = 0.18, for ANOVA with repeated measures and within–between interaction.

Materials

The material comprised a list of 78 nouns (mean length 6.4 letters, ranging from 3 to 11 letters) belonging to six different semantic categories and containing a majority of letters in one of six font colours. These words were assigned to the following sets: (a) 45 targets: 15 names of animals, 15 clothes, and 15 fruits; for each of these semantic categories a third was printed predominantly in red, a third in green, and a third in brown font colour; (b) nine critical lures: words belonging to the same semantic categories and containing a majority of letters in the same colours as the targets; (c) nine category lures: words belonging to the same semantic categories as the targets but differing in the predominant font colour (grey, blue or yellow); (d) nine colour lures: words with a majority of red, green or brown letters (as targets) but differing from the targets in the semantic category (furniture, tools or musical instruments); and (e) six study items similar in category and colour to the targets serving as primacy and recency buffers. All the words were presented in 60-point Times New Roman font. The first letter of each word was in its predominant font colour. For shorter words, all the letters except one were presented in the predominant colour; for longer words, two or three letters were presented in different nondominant colours. When the majority of letters were presented in red, green or brown, the colours of the remaining letters were chosen at random from grey, blue or yellow colours, and vice versa. The background screen was black.

Procedure

The procedure is schematically depicted in Fig. 1. At study, the participants were instructed to try to memorize all the presented words and to answer the orienting question using a keyboard. In the category condition, the participants were asked to press one of three keys that corresponded to the animals, clothes or fruits categories. In the colour condition, they were asked to indicate whether most letters of the presented word are brown, green or red.

The 45 target words (3 colours × 3 categories × 5 exemplars) were presented at study in a random order at a rate of 4 s with an interstimulus interval of 250 ms. Three words were added as buffers at the beginning and another three at the end of the study list. All the words were displayed in the centre of the computer screen. The response options were prompted in a white frame below the target word.

At test, 27 targets (3 colours × 3 categories × 3 exemplars) mixed with 27 distractors (9 critical lures, 9 category lures, and 9 colour lures) were presented in a random order. The participants were asked to recognize items using a 4-point confidence scale: 1 (definitely new), 2 (probably new), 3 (probably old), 4 (definitely old). The response options were displayed in a white frame at the bottom of the slide. The test trials were participant-paced with the next trial appearing immediately after a response.

The participants were examined at individual workstations in the University Lab. The presentation of the stimuli and the response recording were controlled using the E-Prime program 2.0 (Psychology Software Tools, Pittsburgh, PA).

Data analysis

Analysis of variance (ANOVA). An α level of 0.05 was used for all statistical tests. For repeated measures ANOVA, whenever the assumption of sphericity was not met, as indicated by Mauchly’s test, we reported Greenhouse–Geisser corrected degrees of freedom. For some of our dependent variables, we found that the assumption of distribution normality was not met; however, we decided to conduct parametric ANOVAs, taking into account the suggestions in the literature about F-test robustness to violations of normality (e.g., Blanca et al., 2017).

Signal-detection measurement model. Calculations of estimates of signal-detection parameters were performed using SDT Assistant software (Hautus, 2014). It provides maximum likelihood estimates of parameters using a quadratic convergence procedure. We assumed the unequal-variance normal model and computed d_a as the memory sensitivity parameter (Simpson & Fitter, 1973) and x_c as the response criterion location; we reported only the placement of the middle criterion, which divides the decision axis into positive (“definitely old” and “probably old”) and negative responses (“definitely new” and “probably new”). Because of a low number of trials collected from each participant, we calculated the signal-detection model parameters from pooled data (Hautus, 1997). The hypotheses about the differences between the parameters were tested using the z statistic in the manner recommended by Wickens (2002, Ch. 11.4).
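As a rough illustration of the quantities named above, the following sketch computes a sensitivity index and a criterion from a single hit rate and false alarm rate, and a z statistic for the difference between two independent estimates. It is not the maximum likelihood procedure of SDT Assistant: the function names are hypothetical, the zROC slope is an input that defaults to 1 (the equal-variance case, where d_a reduces to d′), and the criterion c is the standard equal-variance index rather than the x_c location reported in Table 2.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def z(p):
    """Inverse standard-normal CDF of a proportion strictly between 0 and 1."""
    return _STD_NORMAL.inv_cdf(p)


def d_a(hit_rate, fa_rate, slope=1.0):
    """Sensitivity for a zROC with the given slope; slope=1 gives d'."""
    return sqrt(2.0 / (1.0 + slope ** 2)) * (z(hit_rate) - slope * z(fa_rate))


def criterion_c(hit_rate, fa_rate):
    """Equal-variance criterion; larger values mean more conservative responding."""
    return -0.5 * (z(hit_rate) + z(fa_rate))


def z_difference(est1, se1, est2, se2):
    """z statistic for the difference between two independent parameter estimates."""
    return (est1 - est2) / sqrt(se1 ** 2 + se2 ** 2)


# Illustration with the mean rates reported for targets and colour lures
# (means of per-participant rates, not the pooled data used in the article).
category = (0.85, 0.04)  # hit rate, colour-lure false alarm rate
colour = (0.75, 0.19)
print(d_a(*category), d_a(*colour))
print(criterion_c(*category), criterion_c(*colour))
```

With these inputs the category condition yields both the larger sensitivity value and the more conservative criterion, which is the same direction as the pooled estimates described in the results, although the numerical values are not comparable to those in Table 2.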

Multinomial processing tree model. The two-high-threshold model (2HTM) is a discrete-state model which assumes that recognition memory is mediated by discrete “detect” and “guessing” states (Kellen et al., 2015). A graphical representation of the 2HT multinomial processing tree model used in the current research is depicted in Fig. 2. It was based on the version of 2HTM presented by Kellen et al. (2015, see Fig. 1). According to this model, parameter D_o represents the probability that an old item is detected, leading to a “definitely old” response (with probability s) or “probably old” response (with probability 1 − s). If the old item is not detected (1 − D_o), the status of the item is guessed as old, with probability g, or as new (1 − g). For items guessed as old, high or low confidence “old” responses are chosen with probability a_o or (1 − a_o), respectively. For undetected items guessed as new, the “definitely new” response is chosen with probability a_n, and the “probably new” response with probability (1 − a_n). A new item presented at test is detected with probability D_n, leading to a “new” response with high confidence (parameter n) or low confidence (1 − n). In the current research, we assume that both the D_n and n parameters can differ depending on the kind of lure used at test. When detection of a new item fails (1 − D_n), a guessing state is entered into, which is assumed to be the same as in the case of undetected old items, and identical for all kinds of lures. The goodness of fit of the model to the empirical data was tested with the log-likelihood ratio statistic (G²), which is distributed asymptotically as a χ² distribution. At an α level of 0.05, the critical value is G²(1) = 3.84. Computations were carried out with the multiTree computer program (Moshagen, 2010). Some of the parameter estimates (e.g., parameter s) were close to the upper boundary of the parameter space (i.e., near 1). In such a case, the use of bootstrap simulations is recommended to draw inferences regarding the variability of the parameter estimates (Moshagen, 2010; Singmann & Kellen, 2013).
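The tree described above can be written out as predicted response probabilities. The sketch below is only a transcription of the verbal description into Python, with hypothetical function names and made-up parameter values; it does not fit the model (that was done with multiTree) and is not taken from the authors' materials.

```python
def old_item_probs(D_o, s, g, a_o, a_n):
    """Predicted response probabilities for a studied (old) item."""
    return {
        "definitely old": D_o * s + (1 - D_o) * g * a_o,
        "probably old": D_o * (1 - s) + (1 - D_o) * g * (1 - a_o),
        "probably new": (1 - D_o) * (1 - g) * (1 - a_n),
        "definitely new": (1 - D_o) * (1 - g) * a_n,
    }


def new_item_probs(D_n, n, g, a_o, a_n):
    """Predicted response probabilities for a lure; D_n and n may differ by lure type."""
    return {
        "definitely old": (1 - D_n) * g * a_o,
        "probably old": (1 - D_n) * g * (1 - a_o),
        "probably new": D_n * (1 - n) + (1 - D_n) * (1 - g) * (1 - a_n),
        "definitely new": D_n * n + (1 - D_n) * (1 - g) * a_n,
    }


# Hypothetical parameter values, chosen only to illustrate the calculation.
guessing = {"g": 0.4, "a_o": 0.3, "a_n": 0.5}
targets = old_item_probs(D_o=0.6, s=0.9, **guessing)
lures = new_item_probs(D_n=0.5, n=0.8, **guessing)
print(targets)
print(lures)
```

Because the guessing parameters (g, a_o, a_n) are shared between old and new items, only D_n and n can move the predicted "definitely new" rate for one lure type relative to another, which is why these two parameters are the ones compared across lure types and encoding conditions.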

Results and discussion

At study, the participants answered the orienting questions almost perfectly (98.5% and 98.8% of the responses were correct for the category and colour conditions, respectively). The raw data as well as the tables presenting the hit rates and false alarm rates across response criteria for both experiments are available at https://osf.io/st6rc/.

Standard statistical analyses (ANOVA)

Mean proportions of acceptances of targets and lures across confidence levels are presented in the upper half of Table 1. For the hit rate (i.e., the proportion of “definitely old” and “probably old” responses to targets) as the dependent variable, a one-way ANOVA examining the effect of encoding task (colour orienting task, category orienting task) revealed a significantly higher hit rate in the category condition (M = 0.85, SD = 0.102) than in the colour condition (M = 0.75, SD = 0.139), F(1) = 9.61, p < 0.01, η_p² = 0.16.

Table 1 Mean (SD) proportions of targets and lures classified to four confidence levels, and the mean (SD) confidence ratings of items in Experiments 1 and 2, depending on the encoding condition

For the false alarm rate (i.e., the proportion of “definitely old” and “probably old” responses to lures) a 3 (distractor type) × 2 (encoding task) mixed ANOVA was calculated, with the distractor type (critical lure, category lure, and colour lure) manipulated within subjects, and the encoding task (colour orienting task, category orienting task) manipulated between subjects. A main effect of distractor type, F(2) = 22.71, p < 0.001, η_p² = 0.31, and an interaction effect, F(2) = 8.86, p < 0.001, η_p² = 0.15, were revealed; however, no effect of encoding condition was observed, F(1) = 1.56. Post hoc comparisons showed significantly more false alarms for critical lures (M = 0.26, SD = 0.168) than for category lures (M = 0.13, SD = 0.134), t(52) = 5.68, p_Bonf < 0.001, d = 0.78, and colour lures (M = 0.12, SD = 0.139), t(52) = 5.32, p_Bonf < 0.001, d = 0.73, but there was no difference in false alarm rate between the category lures and the colour lures, t(52) = 0.39. No significant differences were found in false alarm rate for the category and critical lures depending on the encoding condition; however, for the colour lures, a higher rate was observed in the colour condition (M = 0.19, SD = 0.143) than in the category condition (M = 0.04, SD = 0.084), t(42.17) = 4.70, p < 0.001, d = 1.29.
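The effect sizes reported for these analyses can be related to the test statistics with standard conversion formulas. The sketch below is a reader's check rather than the authors' computation: the function names are hypothetical, and the error degrees of freedom (51 for the between-subjects effect, 102 for the within-subjects effects) are an assumption derived from N = 53 in two groups, since only the effect degrees of freedom are stated in the text.

```python
from math import sqrt


def partial_eta_squared(F, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (F * df_effect) / (F * df_effect + df_error)


def d_paired(t, n):
    """Cohen's d for a paired comparison, computed as t divided by sqrt(n)."""
    return t / sqrt(n)


def d_independent(t, n1, n2):
    """Cohen's d for two independent groups, computed from t and the group sizes."""
    return t * sqrt(1.0 / n1 + 1.0 / n2)


# Error df values below are assumptions (see the lead-in), not reported figures.
print(round(partial_eta_squared(9.61, 1, 51), 2))    # hit rate, encoding task
print(round(partial_eta_squared(22.71, 2, 102), 2))  # false alarms, distractor type
print(round(partial_eta_squared(8.86, 2, 102), 2))   # false alarms, interaction
print(round(d_paired(5.68, 53), 2))                  # critical vs. category lures
print(round(d_independent(4.70, 27, 26), 2))         # colour lures, between groups
```

Under these assumptions the conversions return values that round to the effect sizes given in the text (0.16, 0.31, 0.15, 0.78, and 1.29).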

The results concerning confidence ratings are parallel to the results described above for hit rates and false alarm rates and they are presented in the “Appendix 1”.

Signal detection analyses

The upper part of Table 2 presents SDT parameter estimates based on data pooled over the participants. Memory sensitivity parameter d_a comparisons between the encoding conditions showed a better sensitivity in the category condition than in the colour encoding condition. In detail, in the category condition, memory sensitivity was better than in the colour condition when category lures, z = 3.14, p < 0.002, colour lures, z = 8.38, p < 0.001, and critical lures, z = 2.82, p < 0.005, were used for the calculations of the false alarm rates. The placement of the middle response criterion x_c for colour lures was significantly more conservative in the category condition than in the colour condition, z = 5.45, p < 0.001. In the case of the category lures and the critical lures, no significant differences were found in the placement of the response criteria between the encoding conditions.

Table 2 Signal detection parameter estimates (SE) based on data pooled over participants

Two-high-threshold model analyses

Figure 3 shows bootstrapped estimates of 2HTM parameters and their standard deviations. The goodness of fit of the model to the empirical data was satisfactory, G² (5) = 5.22, p = 0.39. The D_o detection of old words parameter was significantly higher in the category condition than in the colour encoding condition, ΔG² (1) = 24.31, p < 0.001. In a similar way, the D_ncol detection parameter of the colour lures was significantly higher in the category condition than in the colour-encoding condition, ΔG² (1) = 27.77, p < 0.001. Finally, the n_col parameter representing a high confidence of “new” response tended to be higher in the category condition than in the colour condition, ΔG² (1) = 3.63, p = 0.06.

When comparing detection (D_n) across the types of lures, an interesting crossover of effects can be observed between the encoding conditions. In the category condition, colour lures were significantly better detected than category lures, ΔG² (1) = 13.26, p < 0.001, whereas in the colour condition, category lures were better detected than colour lures, ΔG² (1) = 4.49, p < 0.04. Critical lures were detected worse than both colour and category lures in almost all conditions, ΔG² (1)s > 12.47, ps < 0.005; the only exception occurred in the colour condition, where the detection of critical lures did not differ from that of colour lures, ΔG² (1) = 2.03.
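The parameter comparisons above are likelihood-ratio tests: each ΔG² with one degree of freedom is referred to a χ² distribution. The following sketch shows how such a statistic and its p value can be computed; the function names are hypothetical, the code is not part of multiTree, and the closed-form p value used here holds only for one degree of freedom.

```python
from math import erfc, log, sqrt


def g_squared(observed, expected):
    """Log-likelihood ratio statistic, 2 * sum(O * ln(O / E)).

    Cells with an observed count of zero contribute nothing to the sum.
    """
    return 2.0 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)


def p_value_chi2_df1(statistic):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2.0))


# ΔG²(1) values taken from the text of the results.
for delta_g2 in (3.84, 3.63, 4.49, 2.03):
    print(delta_g2, round(p_value_chi2_df1(delta_g2), 3))
```

The value 3.84 lands at the 0.05 boundary, matching the critical value stated in the data analysis section, and the p values for 3.63 and 4.49 agree with the reported p = 0.06 and p < 0.04.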

In sum, Experiment 1 confirmed a typical positive effect of LoP for words, that is, the category encoding task resulted in a significant increase in hit rates and decrease in false alarm rates. Both SDT and 2HT analyses also confirmed the LoP effect. Turning to the effects on false memory for specific lures, the critical lures were significantly more often falsely accepted than the colour or the category lures. However, a predicted interaction effect was also observed: while for the category lures the level of false alarms was similar in the colour and the category encoding conditions, for the colour lures, the category encoding condition resulted in a salient drop in false alarms in comparison with the colour condition. This suggests that the participants effectively rejected the lures matching in colour but inconsistent in category with the targets only when they focused their attention on categories at study. The SDT analysis of the response criterion placement confirmed that the participants were more conservative in accepting colour lures in the category condition than in the colour-encoding condition, indicating an influence of retrieval monitoring. Moreover, 2HT analyses indicated that colour lures were best detected as lures in the category encoding condition and, what is more, this was done with high confidence, again suggesting recall-to-reject monitoring.

Experiment 2

The aim of the second experiment was to demonstrate the processing-material interaction effects on false recognitions with pictorial materials. Differences in the nature of memory traces for visual and verbal materials are expected on the basis of, for example, the observation that the accuracy of visual recognition of object drawings is uncorrelated with the accuracy of recognition of the verbal labels of the same stimuli (Bahrick & Bahrick, 1971; Bahrick & Boucher, 1968). Moreover, the greater physical distinctiveness of pictures in comparison with words, suggested by the sensory-semantic model, may result in enhanced picture encoding during the perceptual orienting task (Intraub & Nicklos, 1985). As in Experiment 1, LoP was manipulated by colour or category naming orienting tasks, and lures differed according to the kind of consistent and differentiating features. In Experiment 1, our results suggested that colour lures inconsistent in category are most effectively rejected when participants attend to the category of studied words. In Experiment 2, we predicted that category lures inconsistent in colour are effectively rejected when participants attend to the colour of studied pictures.