Introduction

To place objects and events into categories requires learning to extract regularities in those objects and events. Learning the commonalities of a category allows organisms to classify similar objects and events as members of the same category. If we know that tulips are bell-shaped flowers with three petals and three sepals, then we will accurately identify as tulips any flowers sharing those characteristics and expect them to similarly grow and blossom. However, some members of a category may depart from the category’s regular characteristics. Typically, fruits have no fat; but avocado is a fruit with a high amount of fat, making it an exception.

Exceptions violate a known regularity; they decidedly differ from most members of their category. This distinctiveness seems to confer a memory benefit on exceptions. Prior studies have found exceptions to be better remembered than regular items (Palmeri & Nosofsky, 1995; Sakamoto & Love, 2004; Von Restorff, 1933), either because they are more deeply processed during learning (Fabiani & Donchin, 1995; Hunt & Lamb, 2001; Jenkins & Postman, 1948) or because other items are less likely to hinder their retrieval (Nairne, 2006; Sakamoto & Love, 2006). Despite this memory benefit, exceptions are thought to be learned more slowly. For example, according to Nosofsky et al.’s (1994) rule-plus-exception (RULEX) model, people first learn rules; when those rules fail to correctly categorize some items, people add exceptions to those rules (see also Love et al., 2004).

However, most studies have not examined the trajectory of learning regular and exception items. One laudable exception is Cook and Smith (2006), who compared humans’ and pigeons’ learning of artificial categories that included prototypes, regular items, and exceptions. They found that accuracy for exception items rose more slowly than for regular items. Critically for our interests, the same pattern held for pigeons and humans: an early tendency to categorize based on similarity, followed by memorization of exceptions.

Thus, category exceptions seem to be more difficult to learn, but to be better remembered. Yet, this conclusion derives from studies involving exceptions of only one kind, namely, those generated from the prototype of the opposite category (see Savic & Sloutsky, 2019, for extended review). That is, the exceptions shared most of their features with the members of the opposite category. We will call exceptions of this kind crossovers. One example would be dolphins; they live under water and look like fish, but are mammals. Yet, exceptions can also be entirely unique. For example, penguins are birds that do not look like birds, do not fly, and stand erect on their feet; they swim and feed under water, but they do not look much like fish. We will call exceptions of this kind oddballs: dissimilar from members of their own category, but dissimilar from members of other categories as well.

Sakamoto and Love (2006) explored the effects of different types of exceptions when human participants had to categorize lines of different colors and lengths. They found that crossover exceptions were recreated more accurately, but exceptions that were markedly different from both their own and the opposite category were better recognized. More recently, Savic and Sloutsky (2019) studied adults’ and 4-year-olds’ categorization of humanoid creatures. They did not observe any recognition memory advantage for oddball exceptions compared to regular items. Moreover, for both adults and 4-year-olds, accuracy at the end of training was higher for regular than for oddball items – except when the oddballs were initially trained in isolation. Thus, it is unclear whether oddballs are benefitted or harmed by their uniqueness. To our knowledge, no studies with humans or animals have compared how different types of exceptions are learned or how they may affect the learning of regular items. These matters motivated the present study.

Pigeons excel in a variety of categorization tasks (see Lazareva & Wasserman, 2017). Importantly, Cook and Smith (2006) found that pigeons’ categorization of rule and exception items was strikingly similar to that of humans. So, here, we examined pigeons’ learning of regular and exception items involving either crossover (Crossover group) or oddball (Oddball group) exceptions.

To gain further insights into the acquisition of rules and exceptions, we specifically focused on the course of learning and not its endpoint. Cook and Smith (2006) had reported that category learning differed early and late in training; their pigeons and humans first learned the commonalities of the category exemplars and only later learned the exceptions. But the exception items in that study were only crossovers, which should obviously have taken longer to learn because they followed the rule of the opposite category. This learning disparity may not occur with oddball items. On the one hand, because the oddball items cannot benefit from the advantages of following a rule, their learning may be hindered (e.g., Wasserman et al., 1988); on the other hand, because they are highly dissimilar from any other item, interference is minimized and their learning may be hastened (e.g., Nairne, 2006).

Finally, beyond the course of learning the exception items, different types of exceptions may differently affect classification of the regular items. Most features in the crossover exceptions – but not in the oddball exceptions – are associated with one category when they are part of a regular item, but with the opposite category when they are part of the exception. Therefore, these features can activate both category responses, so that errors to regular items when the exceptions are crossover may be higher compared to when the exceptions are oddballs; consequently, correct categorization of regular items may take longer when the exceptions are crossovers.

Method

Subjects

The subjects were 16 homing pigeons (Columba livia) maintained at 85% of their free-feeding weight. All the pigeons had served in unrelated studies prior to the present project. The 16 pigeons were randomly divided into two groups of eight pigeons each: the Crossover group and the Oddball group. All procedures were approved by the Institutional Animal Care and Use Committee at the University of Iowa.

Apparatus

The study used 36 × 36 × 41 cm operant conditioning chambers detailed by Gibson et al. (2004). The chambers were located in a dark room with continuous white noise in the background. Each chamber was equipped with a 15-in. LCD monitor located behind an AccuTouch® resistive touchscreen (Elo TouchSystems, Fremont, CA, USA). The portion of the screen that was viewable by the pigeons was 28.5 × 17.0 cm (970 × 640 pixels). Pecks to the touchscreen were processed by a serial controller board outside the box. A rotary dispenser delivered 45-mg pigeon pellets into a food cup located in the center of the rear wall opposite the touchscreen. Illumination during the experimental sessions was provided by a houselight mounted on the upper rear wall of the chamber. The pellet dispenser and houselight were controlled by a digital I/O interface board. Each chamber was controlled by its own Apple® iMac® computer. Programs to run this experiment were developed in MATLAB® with Psychtoolbox-3 extensions (Brainard, 1997; Pelli, 1997; http://psychtoolbox.org/).

Training stimuli

A total of 20 colored triangles were used to create the different category exemplars (see Fig. 1). Each of the exemplars contained five triangles forming a circular shape that occupied a 7 × 7 cm square area. We deployed the category structure used by Smith and Minda (1998) and Cook and Smith (2006). Each category had a prototype that differed in all five colors from the prototype of the other category; the prototype was never presented in training. For each category, four regular exemplars and one exception exemplar were presented. Regular exemplars shared four features from their own category prototype and one feature from the opposite category prototype. Critically, none of the features was a perfect predictor of the correct category. In the Crossover group, exception items contained only one feature from their own category prototype and four features from the opposite category prototype. In the Oddball group, exception items were unique and did not share any feature with any other stimulus.

Fig. 1

Training exemplars for Category A and Category B. Crossover exceptions were presented only to the Crossover group, whereas oddball exceptions were presented only to the Oddball group. Regular exemplars were the same for both groups. The prototype was not presented during training
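As a rough illustration of the category structure described above, the following Python sketch builds the exemplars from two prototypes. The colour labels (a0..a4, b0..b4, oa0..oa4, ob0..ob4), the function names, and the choice of which positions are swapped are all hypothetical; the actual exemplars are those shown in Fig. 1.

```python
N_FEATURES = 5

# Hypothetical colour labels: prototype A uses a0..a4, prototype B uses b0..b4.
PROTO_A = tuple(f"a{i}" for i in range(N_FEATURES))
PROTO_B = tuple(f"b{i}" for i in range(N_FEATURES))


def swap(own, other, positions):
    """Copy `own`, taking the features at `positions` from `other`."""
    return tuple(other[i] if i in positions else own[i] for i in range(len(own)))


def regular_items(own, other, swapped_positions=(0, 1, 2, 3)):
    """Regular exemplars: four own-prototype features, one opposite feature.

    Which four positions are swapped is an assumption made for illustration.
    """
    return [swap(own, other, {p}) for p in swapped_positions]


def crossover_exception(own, other, kept_position=4):
    """Crossover exception: one own-prototype feature, four opposite features."""
    return swap(own, other, set(range(len(own))) - {kept_position})


def oddball_exception(tag):
    """Oddball exception: five colours that appear in no other stimulus."""
    return tuple(f"{tag}{i}" for i in range(N_FEATURES))


def shared(item, prototype):
    """Number of positions at which `item` matches `prototype`."""
    return sum(f == p for f, p in zip(item, prototype))


regular_a = regular_items(PROTO_A, PROTO_B)
crossover_a = crossover_exception(PROTO_A, PROTO_B)
oddball_a = oddball_exception("oa")
oddball_b = oddball_exception("ob")

# Distinct colours needed across both prototypes and both oddball exceptions.
all_colours = set(PROTO_A) | set(PROTO_B) | set(oddball_a) | set(oddball_b)
```

Under these assumptions, the two prototypes and the two oddball exceptions together require 20 distinct colours, which is consistent with the 20 colored triangles mentioned above.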

Testing stimuli

Because most features of the crossover exception belonged to the opposite category, pigeons had to learn that it was the specific combination of features in the exception that was associated with a specific category response. Thus, pigeons in the Crossover group may learn about the statistical contingencies of all of the features in the category exemplars. However, in order to classify an oddball exception, processing of one single feature is sufficient, because each feature is uniquely associated with a specific category response. Thus, pigeons in the Oddball group may learn about the statistical contingencies of only a few features.

In Test 1a, we presented items that contained either one, two, three, four, or all five features from the prototype of each category to see if the pigeons had learned about the individual features or about specific combinations of features. All possible combinations depicting the different numbers of features were given, so there were a total of 62 unique testing stimuli in Test 1a (see examples in Fig. 2, left).

Fig. 2

Examples of testing items in Test 1a (both groups) and Test 1b (only Oddball group). Labels 1F–4F refer to one, two, three, or four features of the prototype for each category in Test 1a, and one, two, three, or four features of the oddball exception for each category in Test 1b. See text for further details

In Test 1b, pigeons in the Oddball group were tested with stimuli that contained one, two, three, or four features of the oddball exceptions to see whether the pigeons had learned about just one (that was enough to correctly classify the oddball exceptions) or more of the features. All possible combinations with the different numbers of features were given, so there were a total of 60 unique testing stimuli in Test 1b (see examples in Fig. 2, right).

Test 2 was also given to the Oddball group. We created new stimuli by combining features of the prototypes with features of the oddball exception from the same category (congruent exemplars) or with features of the oddball exception from the opposite category (incongruent exemplars). We combined either one, two, three, or four features of the prototype with either four, three, two, or one features of the oddball exception, so that a total of 120 unique testing stimuli were presented in Test 2 (see examples in Fig. 3).

Fig. 3

Examples of congruent (combining features of the prototype with features of the oddball exception from the same category) and incongruent (combining features of the prototype with features of the oddball exception from the opposite category) testing items for Category A in Test 2. Labels 1F–4F refer to one, two, three, or four features of the prototype in the testing items. See text for further details

If the oddball exception had been integrated into the category, so that it was considered a member of the category regardless of its lack of similarity to all of the other items in the same category, then accuracy for congruent exemplars should be as high as accuracy for the regular exemplars, but accuracy for incongruent exemplars should be low. On the other hand, if the oddball exception had not been integrated into the category, then neither congruent nor incongruent exemplars should be categorized accurately because they all were totally novel combinations.
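The stimulus counts given for Tests 1a, 1b, and 2 can be checked with a short combinatorial sketch. It assumes that a testing stimulus is fully defined by which of the five triangle positions carry a given kind of feature; the function and variable names are illustrative only.

```python
from math import comb

N_FEATURES = 5      # triangles per exemplar
N_CATEGORIES = 2    # Category A and Category B


def n_position_subsets(sizes):
    """Ways to choose which feature positions are filled, summed over sizes."""
    return sum(comb(N_FEATURES, k) for k in sizes)


# Test 1a: one to five prototype features, for each category.
test_1a = N_CATEGORIES * n_position_subsets(range(1, 6))

# Test 1b: one to four oddball-exception features, for each category.
test_1b = N_CATEGORIES * n_position_subsets(range(1, 5))

# Test 2: one to four prototype features, with the remaining positions taken
# from an oddball exception; congruent and incongruent versions per category.
N_CONGRUENCY = 2
test_2 = N_CATEGORIES * N_CONGRUENCY * n_position_subsets(range(1, 5))
```

Under this assumption, the three totals come to 62, 60, and 120, matching the numbers of unique testing stimuli stated for each test.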

Procedure

Training

Daily training sessions comprised 120 trials: half presented Category A exemplars and half presented Category B exemplars, in a random fashion. At the beginning of a trial, the pigeons were presented with a start stimulus – a white square (3 × 3 cm) in the center of the computer screen. After one peck anywhere on this white square, a category exemplar was displayed in the center of the screen. The pigeons had to satisfy an observing response requirement (number of pecks at the stimulus image; gradually increased from two to a maximum of 25 pecks on a daily basis). On completion of the observing response requirement, two report buttons appeared 4 cm to the left and right of the category exemplar. The report buttons were 2.7 × 7.2 cm rectangles filled with two distinctive black-and-white patterns. The pigeons had to peck one of the two report buttons, depending on the category presented. If the choice was correct, then food reinforcement was delivered and the intertrial interval (ITI) ensued; the ITI randomly ranged from 6 s to 10 s. If the choice was incorrect, then food was not delivered, the houselight was darkened, and a correction trial was scheduled. Correction trials were given until the correct response was made. No data were analyzed from correction trials. Pigeons were trained until they reached an accuracy level of 85% for each category for two consecutive sessions.
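The learning criterion can be expressed as a small check over per-session accuracies. This is a minimal sketch, not the authors' code: the function name and inputs are hypothetical, and it assumes that "reaching" 85% means an accuracy at or above 0.85.

```python
def sessions_to_criterion(acc_a, acc_b, criterion=0.85, run=2):
    """Return the 1-based session on which criterion is first met, or None.

    acc_a, acc_b: per-session proportions correct for Category A and B
    (correction trials excluded). The criterion is met once both categories
    are at or above `criterion` for `run` consecutive sessions.
    """
    streak = 0
    for session, (a, b) in enumerate(zip(acc_a, acc_b), start=1):
        if a >= criterion and b >= criterion:
            streak += 1
        else:
            streak = 0  # a single sub-criterion session resets the run
        if streak == run:
            return session
    return None


# Made-up accuracies for illustration only.
example = sessions_to_criterion([0.50, 0.86, 0.90, 0.90],
                                [0.60, 0.84, 0.85, 0.88])
```

In the made-up example, Category B falls short on the second session, so the two-session run only starts on the third session and the criterion is met on the fourth.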

Test 1a

Pigeons in both groups were tested with stimuli that contained either one, two, three, four, or all five features of the prototype of each category. Each testing session began with ten warm-up training trials; the next 130 trials comprised 110 training trials plus 20 randomly interspersed testing trials. On training trials, only the correct response was reinforced; incorrect responses were followed by correction trials (differential reinforcement). On testing trials, any choice response was reinforced (nondifferential reinforcement); that is, food was given regardless of the pigeons’ choice response, so that testing could proceed without explicitly teaching the birds the correct response to the testing exemplars. No correction trials were given on testing trials. A total of ten testing sessions were given. Once Test 1a was completed, pigeons in the Crossover group ended their participation in the experiment, whereas pigeons in the Oddball group were returned to training for 7 days before starting the next testing phase.

Test 1b

Pigeons in the Oddball group were tested with stimuli that contained either one, two, three, or four features of the exception of each category. Sessions began with ten warm-up training trials, and the next 126 trials comprised 110 training trials plus 16 randomly interspersed testing trials. All other procedural details were identical to Test 1a. A total of ten testing sessions were given.

Test 2

Pigeons in the Oddball group were tested with stimuli that combined features of the prototype with features of the oddball exception from the same category (congruent exemplars) or from the opposite category (incongruent exemplars). Half of the testing stimuli were congruent, whereas the other half were incongruent. Equal numbers of stimuli containing one, two, three, or four features from the prototype (with the rest of the features from the exception) were presented. Sessions began with ten warm-up training trials, and the next 130 trials comprised 110 training trials plus 20 randomly interspersed testing trials. All other procedural details were identical to Tests 1a and 1b. A total of 16 testing sessions were given.
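The composition of a testing session in each phase can be sketched as follows. This is only an illustration in Python of the trial counts given above (the experiment itself ran in MATLAB); the function name, the trial labels, and the use of a simple shuffle for "randomly interspersed" are assumptions.

```python
import random

# (warm-up training trials, later training trials, testing trials) per session
SESSION_PLAN = {
    "Test 1a": (10, 110, 20),
    "Test 1b": (10, 110, 16),
    "Test 2": (10, 110, 20),
}


def build_test_session(phase, seed=None):
    """Return a list of trial labels for one testing session of `phase`."""
    n_warmup, n_training, n_test = SESSION_PLAN[phase]
    rng = random.Random(seed)
    # Testing trials are mixed at random among the post-warm-up training trials.
    body = ["train"] * n_training + ["test"] * n_test
    rng.shuffle(body)
    return ["train"] * n_warmup + body


session_1a = build_test_session("Test 1a", seed=0)
session_1b = build_test_session("Test 1b", seed=0)
```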

Data analysis

Training and testing data files are available (https://osf.io/c8u4p/). The data were subjected to logit mixed-effects analyses. To select an appropriate random-effects structure (only random intercepts or random slopes as well), we compared models with the same fixed-effects structure and varying complexity in their random-effects structure using the log-likelihood ratio test (Wagenmakers & Farrell, 2004). All analyses were conducted using the lme4 package, version 1.1-21 (Bates et al., 2020), in R, version 3.3.2 (R Development Core Team, 2016). For all analyses reported, the log-likelihood ratio tests indicated that the best-fitting random-effects structure included random intercepts for bird; the addition of random slopes did not improve the model fits. Cohen’s d effect sizes (and their 95% confidence intervals [CIs]) are reported for critical comparisons.

Results

Training

Accuracy started near the 50% chance level in both groups and gradually increased over training sessions. The Crossover group took a mean of 35.4 (SD = 5.83) days to reach the learning criterion of 85% accuracy for each category on two consecutive days, whereas the Oddball group was much faster, reaching this criterion in a mean of 15.9 (SD = 7.77) days. A Welch two-sample t-test revealed that this difference was statistically significant, t(12.98) = 5.67, p < .001, d = 2.83, 95% CId [1.32, 4.36]. Category learning with a crossover exception proved to take much longer than with an oddball exception.
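The reported test statistic can be approximately recovered from the summary statistics alone. The sketch below applies the standard Welch formulas and a pooled-SD Cohen's d to the means, standard deviations, and group sizes given above; the function names are illustrative, and the pooled-SD form of d is an assumption about how the effect size was computed.

```python
import math


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's t and Welch-Satterthwaite degrees of freedom from summaries."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation (assumed definition)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


# Days to criterion: Crossover group vs. Oddball group (eight pigeons each).
t, df = welch_from_summary(35.4, 5.83, 8, 15.9, 7.77, 8)
d = cohens_d(35.4, 5.83, 8, 15.9, 7.77, 8)
```

Computed this way, t is about 5.68 on about 12.98 degrees of freedom and d is about 2.84, which agrees with the reported values up to rounding of the summary statistics.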

The fastest pigeon took 9 days to reach criterion; so, to further assess differences in learning speed, we compared the two groups over the first 9 days. Figure 4 shows that accuracy for both regular and exception items increased quickly in the Oddball group, reaching 83% (regular items) and 93% (exception) by Day 9. In the Crossover group, however, although accuracy for the regular items increased steadily and reached 73% by Day 9, accuracy for the exception never exceeded chance (47% by Day 9), indicating that the pigeons were more likely to classify the exception as a member of the opposite category, as expected if an organism is learning the commonalities within the categories.

Fig. 4

Mean percentage of correct responses throughout the first 9 days of training for regular and exception items in the Crossover (left) and Oddball (right) groups. The dashed line at 50% represents the chance level of correct responses. The dashed line at 85% represents the learning criterion. Error bars indicate the standard error of the means (± 1 SEM)

We fit a logistic mixed-effects model with group (group Crossover = 0), item type (regular item = 0) and session (1–9) as fixed effects. Overall accuracy increased throughout training, B = 0.13, Z = 12.96, p < .001. Importantly, the increase in accuracy for the exception was higher in the Oddball group, B = 0.29, Z = 9.85, p < .001. The three-way interaction was statistically significant, B = 0.24, Z = 7.31, p < .001. To better evaluate learning of the regular and exception items, we conducted follow-up analyses for each group. In the Crossover group, learning for the exception proceeded more slowly than for the regular items, B = -0.15, Z = -6.96, p < .001. By contrast, in the Oddball group, learning for the exception proceeded more quickly than for the regular items, B = 0.09, Z = 3.74, p < .001. Thus, not only is learning a unique exception faster than learning an exception similar to the opposite category, but learning a unique item is faster than learning items that follow the category rule.
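Because the coefficients above are on the log-odds scale, a short conversion may help in reading them. The sketch below turns a slope into an odds ratio and projects accuracy from a starting log-odds; the function names are illustrative, and the intercept used in the projection is a hypothetical value (chance performance), not an estimate from the fitted model.

```python
import math


def odds_ratio(b):
    """Multiplicative change in the odds of a correct response per unit of the predictor."""
    return math.exp(b)


def inv_logit(x):
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def projected_accuracy(intercept_logit, slope, session):
    """Accuracy implied by a linear trend on the log-odds scale."""
    return inv_logit(intercept_logit + slope * session)


session_slope = 0.13                      # reported B for session
per_session_or = odds_ratio(session_slope)

# Hypothetical intercept of 0 log-odds, i.e., 50% accuracy before training.
trajectory = [projected_accuracy(0.0, session_slope, s) for s in range(0, 10)]
```

A slope of 0.13 corresponds to the odds of a correct response growing by roughly 14% per session; how that translates into percentage-point gains depends on the starting accuracy, which is why the projection takes the intercept as an explicit argument.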

Relatedly, could the type of exception also affect learning the regular items? Indeed, it did. A between-group comparison throughout the first nine training sessions revealed that the learning slope for regular items was steeper in the Oddball group than in the Crossover group, B = 0.05, Z = 3.58, p < .001. On Day 9, accuracy for the regular items was reliably higher in the Oddball group (M = 83%) than in the Crossover group (M = 73%), B = 0.64, Z = 2.50, p = .01, d = 0.24, 95% CId [0.14, 0.34]. Thus, the structure of the exceptions affected processing of the regular items.

Test 1a

Figure 5 shows that accuracy for regular items in Test 1a did not differ between groups (M = 93% and M = 92%). For the exceptions, however, accuracy was lower in the Crossover group (M = 81%) than in the Oddball group (M = 96%), indicating that the advantage of the oddball exception persisted even after extensive training.

Fig. 5

Mean percentage of correct responses for the Crossover and Oddball groups in Test 1a. On the left, accuracy for the regular and exception training items during Test 1a. On the far right, accuracy for the prototypes that were never presented in training. Items 1F to 4F are items containing one, two, three, or four features of the prototype. The dashed line at 50% represents the chance level for correct responses. The dashed line at 85% represents the learning criterion. Error bars indicate the standard error of the means (± 1 SEM)

When pigeons were presented with the category prototypes, their accuracy was very high and similar to that for the regular items. As the number of features was reduced, accuracy correspondingly decreased; accuracy was still fair with two features (overall, M = 69%), but it dropped almost to chance when only one feature was presented (overall, M = 56%).

We fit a logistic mixed-effects model with group (group Crossover = 0) and item type (regular item = 0) as fixed effects. We did not include the exception items, because this analysis sought to evaluate use of the defining features of the category. Overall, there were no differences between groups, B = 0.02, Z = 0.11, p = .92. The only difference in accuracy was for the prototype, which was higher in the Oddball group (M = 98%) than in the Crossover group (M = 89%), B = 1.82, Z = 3.09, p = .002, d = 0.37, 95% CId [0.21, 0.52]. Thus, both groups were similarly affected by the reduction in the number of features available.

Test 1b

The Oddball group was also tested with stimuli that contained either one, two, three, or four features of the exception. Note that, for the regular items, attending to all of the features is necessary to master the task, because regular items always contain four features from the category prototype, but one feature from the opposite category. However, all of the features in the oddball exception are unique, so learning about a single feature is sufficient for correct classification. To evaluate whether the Oddball group required fewer features from the exception than from the regular items, we compared their Test 1a performance (stimuli with one, two, three, or four features from the prototype) to their performance in Test 1b (stimuli with one, two, three, or four features from the exception). Figure 6 shows that there were no differences.

Fig. 6

Comparison of the mean percentage of correct responses in Test 1a (features from the prototype) and 1b (features from the exception) for the Oddball group. Items 1F–4F are items containing one, two, three, or four features of the prototype or the oddball exception. The dashed line at 50% represents the chance level for correct responses. The dashed line at 85% represents the learning criterion. Error bars indicate the standard error of the means (± 1 SEM)

A logistic mixed-effects model with item type (prototype = 0), and number of features as fixed effects revealed only an effect of the number of features, B = 0.74, Z = 10.95, p < .001. Thus, the Oddball group did not require a different number of features from the prototype and from the exception to correctly categorize these items. This similarity suggests that: (1) pigeons attended to all or almost all of the features regardless of whether or not it was necessary, and (2) the advantage of the oddball exception was not due to one feature being sufficient to correctly classify these stimuli.

Test 2

Figure 7 shows results from Test 2 in terms of prototype-consistent responses: those that agreed with the category of the correct prototype. Oddball pigeons’ prototype-consistent responses for congruent stimuli were equally high (all M > 95%) regardless of how many features of the prototype and the exception the stimuli contained. However, for incongruent stimuli, pigeons based their responding on the category to which the majority of features belonged. Performance was below chance with 1F and 2F items because the majority of features corresponded to the opposite category. Conversely, performance was above chance for 3F and 4F items because the majority of features corresponded to the correct category.

Fig. 7

Mean percentage of prototype-consistent responses in Test 2 (given only to the Oddball group). Congruent items combined prototype and exception features from the same category, whereas Incongruent items combined prototype and exception features from the opposite category. Items could contain one, two, three, or four features of the prototype (1F–4F). The dashed line at 50% represents the chance level. The dashed line at 85% represents the learning criterion. Error bars indicate the standard error of the means (± 1 SEM)

We fit separate logistic mixed-effects models for congruent and incongruent items, with number of prototype features as the fixed effect (regular item = 0). Regular items were included for comparison purposes. For congruent items, there were no statistical differences among the testing items (all p > .05), and their accuracy did not differ from accuracy for regular items (all M > 96%). Even when all of the testing items were novel combinations, accuracy to congruent test items suffered no generalization decrement.

For incongruent items, accuracy to all testing stimuli was lower than accuracy to regular items: (M = 13%, B = -5.29, Z = -29.69, p < .001), (M = 41%, B = -3.68, Z = -28.64, p < .001), (M = 69%, B = -2.43, Z = -17.97, p < .001), and (M = 91%, B = -0.81, Z = -3.93, p < .001), with 1F, 2F, 3F, and 4F stimuli, respectively. Combining prototype features with features of the oddball from the opposite category greatly disrupted pigeons’ performance. And, the greater the number of features from the opposite exception, the greater the disruption. This result, combined with the results of Tests 1a and 1b, suggests that pigeons’ responses were controlled by most if not all of the features in the training stimuli (Castro et al., 2020).

High accuracy for all congruent items suggests that, despite the oddball exceptions being dissimilar from regular items, both regular items and oddball exceptions may have been learned as members of a coherent category. Pairing different stimuli with the same outcome often promotes the development of an equivalence class among them (e.g., Sidman, 1994; Urcuioli, 2001). Perhaps, the fact that the oddball and regular items of a given category shared the same response helped create just such an equivalence class. Indeed, several theorists consider acquired equivalence to be a key mechanism of category learning (see Urcuioli, 2001).

Alternatively, because all of the single features in the rule-following and oddball items from Category A had been associated with one specific response, whereas all of the features in both the rule-following and oddball items from Category B had been associated with the other response, there is no interference among the features of a novel congruent item. That is, all of the features in a congruent item should elicit the same response, and that may be why categorization for all of them was high. This is not the case for incongruent items; some of the features had been associated in training with the Category A response, whereas others had been associated in training with the Category B response. The graded performance that we observed with incongruent items suggests that pigeons’ categorization choices were the result of a competition process based on how many features would activate one specific response. That is, the greater the number of features that had been associated in training with a specific response, the more likely that response would be.
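One very simple way to make this competition account concrete is a proportion rule in which each feature casts one vote for the response it was paired with in training. The sketch below is a toy illustration under that assumption, not a model fitted or proposed in this study; the function name is hypothetical, and the observed values are the incongruent-item means reported for Test 2.

```python
N_FEATURES = 5


def p_prototype_response(n_prototype_votes, n_opposite_votes):
    """Proportion rule: probability of the prototype-consistent response
    equals the share of features trained with that response."""
    return n_prototype_votes / (n_prototype_votes + n_opposite_votes)


# Incongruent items: k prototype features vote one way, the remaining
# (5 - k) oddball features from the opposite category vote the other way.
predicted_incongruent = {
    k: p_prototype_response(k, N_FEATURES - k) for k in range(1, 5)
}

# Congruent items: all five features were trained with the same response.
predicted_congruent = p_prototype_response(N_FEATURES, 0)

# Reported mean prototype-consistent responding for 1F-4F incongruent items.
observed_incongruent = {1: 0.13, 2: 0.41, 3: 0.69, 4: 0.91}

# Does the toy rule fall on the same side of chance as the data in each case?
same_side_of_chance = all(
    (predicted_incongruent[k] > 0.5) == (observed_incongruent[k] > 0.5)
    for k in observed_incongruent
)
```

Even this crude rule reproduces the ordering of the incongruent items and their position relative to chance, although the observed means are more extreme than a strict proportion rule would predict at 1F, 3F, and 4F.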

Discussion

Pigeons learned to classify the oddball exceptions faster than the crossover exceptions (Fig. 4). Moreover, asymptotic accuracy was higher for the oddball exceptions, even after extensive training (Fig. 5). Acquisition curves showed that this disparity was due to pigeons’ tendency to misclassify the crossover exceptions as members of the opposite category. These mistakes suggest an initial category learning strategy based on abstracting the common features of the category exemplars (Cook & Smith, 2006).

However, it does not seem to be always true that commonalities are mastered first and exceptions are added later (Cook & Smith, 2006; Nosofsky et al., 1994; Palmeri & Nosofsky, 1995). We found that this happens with crossover exceptions, but not with oddball exceptions, in which the exception decidedly differs from its own or the opposite category. Indeed, successful performance to a distinctive item may be easier to achieve than to rule-following items if that item is unique and not confusable with items from other categories. Thus, memorization of oddball exceptions can happen earlier or at the same time as abstraction of regularities in the rule-following items.

Furthermore, the disparate learning speeds for the crossover and oddball exceptions suggest that deviation from the category rule is not the reason why exceptions are often learned more slowly. Both crossover and oddball exceptions deviate from the rule, but only the crossover exception is learned more slowly. Thus, it seems that confusability – the extent to which the exception resembles members of the opposite category – rather than distinctiveness – the extent to which the exception differs from members of its own category – stunts learning of an exception item. Most likely, crossover items associatively activate the response for the opposite category (Nairne, 2006) because most of their features are more highly correlated with the opposite category; classification errors are thus prone to occur and to impede learning.

Implications for models of category learning

Many categorization models have considered exceptions in category learning, but they generally do not take into account the possible implications of various types of category exceptions. For example, although RULEX (Nosofsky et al., 1994) has proven to be successful in simulating learning and memory patterns of crossover exceptions, it is undetermined how this model would explain performance with other category structures, such as oddball exceptions. Nonetheless, RULEX could, at least in principle, be adapted to predict the acquisition patterns that we observed (Fig. 4). One possibility is that RULEX could concurrently learn the oddballs with the regular items, by forming a complex disjunctive rule which captures both the regulars and the oddballs. That way, RULEX could predict that oddball exceptions would be learned as quickly as regular items.

SUSTAIN (Love et al., 2004) could also explain the effects of the different kinds of exceptions on acquisition. SUSTAIN generates a cluster in memory for items that share common characteristics and generates a new cluster when the current clusters prove to be inadequate for a new distinctive item. Because a cluster encoding a crossover exception will tend to be activated by similar rule-following items from the opposite category, it must become more detailed to avoid activation by the wrong items (Sakamoto & Love, 2006). Thus, it should take longer to create a stable cluster solution for categories with crossover exceptions than for categories with oddball exceptions.

Still, it is not obvious how either of these models could predict our entire pattern of results. For example, it is unclear how any form of RULEX rule-based representation could avoid underestimating the classification accuracy of items that do not contain the set of features that define the rule (Figs. 5, 6, and 7). Also, it is unclear how the current version of SUSTAIN could predict the interchangeability of features between regular and exception items (Fig. 7). To achieve high accuracy, SUSTAIN needs to form separate clusters for regular and exception items. Due to the lateral inhibition between the clusters, it should perform poorly on novel items that combine features from separate clusters representing the same category.

Importantly, we are not claiming that it would be impossible for these models to include mechanisms to explain differences in categorization performance due to various types of exceptions. Rather, we are simply pointing out that challenging questions remain unaddressed as to how various types of exceptions are processed.

We should also acknowledge that, in our experiment, only categorization choice data were recorded. Further experiments in which memory is also tested, or even the brain areas involved in the processing of rule-following and exception items are monitored, should permit an even better understanding of how possible extensions of categorization models could appropriately address the processing of the different types of exceptions.

Categorization of rules and exceptions across species

Pigeons learned to correctly classify the oddball exceptions more quickly than the crossover exceptions and the regular items. Prior studies had shown that oddball exceptions yielded more accurate memory in human adults (Sakamoto & Love, 2006), but not in young children (Savic & Sloutsky, 2019). We did not test pigeons’ post-training memory for the different regular and exception items, but it would be interesting to see whether pigeons’ performance advantage with the oddball exception would also translate into better memory. Because human studies have not paid much attention to the trajectory of learning regular and exception items, it would also be interesting to see whether performance during the acquisition stage is better or worse for regular items and the different types of exceptions in adults and young children.

Although pigeons’ categorization abilities have proved to be remarkable (e.g., Lazareva & Wasserman, 2017), even comparable to humans’ in many cases (e.g., Cook & Smith, 2006), important discrepancies between species have been observed as well. For example, when categorizing various stimuli, human adults tend to attend selectively; that is, they focus their attention on a single stimulus feature or dimension and deploy unidimensional rules or strategies (e.g., Ahn & Medin, 1992; Kemler Nelson, 1984; Nosofsky et al., 1994; Regehr & Brooks, 1995). Pigeons may also use single features to categorize stimuli (e.g., Castro & Wasserman, 2014; Lea & Wills, 2008; Nicholls et al., 2011). However, their attention tends to be distributed rather than focused on specific diagnostic features (Castro et al., 2020); this is also the case for young children (Best et al., 2013; Deng & Sloutsky, 2016). How these similarities and differences may affect category learning, performance, and memory of rules and exceptions with different category structures is a challenging, but exciting opportunity for future research. Comparison among different species – as well as through different stages of development – may offer considerable insights into the evolutionary roots of categorization and the nature of cognition with and without language.

Conclusions

Not all exceptions are created equal – neither for humans (Sakamoto & Love, 2006; Savic & Sloutsky, 2019) nor for pigeons. Our findings represent an important step in understanding the role of exceptions in category learning. Different kinds of exceptions are learned at different rates. Moreover, different kinds of exceptions can also differently affect processing of category regularities. Deviation from the category rule per se does not explain why exceptions are typically more difficult to learn. Rather, confusability with members of the opposite category hinders learning, but uniqueness facilitates it.