Selective and distributed attention in human and pigeon category learning☆
Introduction
If it has fur, then it must be a mammal; if it has feathers, then it must be a bird. This kind of reasoning is typical of our daily inductive inferences. Indeed, in order to classify numerous objects and events, an organism must perceive and attend to those features that are common to exemplars of one category and that distinguish this category from all others. Thus, it seems logical to say that, when humans and nonhuman animals learn to categorize diverse stimuli, it is advantageous for them to focus on those features that are relevant to mastering the task.
Attention figures prominently in many models of human categorization, including exemplar models (Kruschke, 1992; Nosofsky, 1986, 1992), prototype models (e.g., Smith & Minda, 1998), and clustering models (e.g., Love et al., 2004), as well as in various animal associative learning models (e.g., George & Pearce, 2012; Mackintosh, 1965, 1975). According to these accounts, attention is pliable; it tends to be distributed along multiple stimulus dimensions at the beginning of learning, but it converges on the most relevant dimensions as learning proceeds. In addition, deploying attention is assumed to help optimize performance, particularly when categories can be distinguished on the basis of only a few dimensions. For example, to discriminate between squirrels and chipmunks, one should shift attention to stripes (diagnostic feature) and away from the tail or fur (nondiagnostic features).
Although it is entirely reasonable for organisms to focus their attention on the stimulus feature(s) conveying information that is relevant to solving a categorization task, it is important to appreciate that a category discrimination can also be accomplished by perceiving the overall similarity or family resemblance of the exemplars in each category, so that attention may become more widely distributed among multiple features. It has often been observed that, under explicit, intentional classification and learning conditions, healthy human adults tend to use a single deterministic dimension; however, they tend to rely on multiple probabilistic dimensions that contribute to overall exemplar similarity under implicit learning conditions (e.g., Kemler Nelson, 1984; Love, 2002; Waldron & Ashby, 2001).
Learning under those different circumstances may involve separate mechanisms, as suggested by COVIS, the categorization theory proposed by Ashby and his colleagues (Ashby et al., 1998; Ashby & Valentin, 2005; Ashby & Waldron, 1999). According to COVIS, category learning may be accomplished by two different systems: 1) a frontal-based explicit system that uses language and logical reasoning, and allows the organism to learn relatively quickly, and 2) a basal ganglia-mediated implicit system that involves procedural learning, and results in learning taking place slowly, in an incremental fashion, and being highly dependent on immediate feedback. Correspondingly, these two systems involve different types of attention. Attention is hypothesis-driven under the explicit mechanism, whereas attention is stimulus-driven and shaped by the contingencies of reinforcement under the implicit mechanism; that is, stimulus features or dimensions are differentially weighted based on their capacity to predict the correct category. As has become the custom in the human cognition literature, we will reserve the term selective attention for the top-down, hypothesis-driven type of attention.
Early in human development, categories can be learned without engaging selective attention (e.g., Best et al., 2013). Infants can learn the statistical co-occurrence of several features within the exemplars of different categories; thus, infants' attention tends to be distributed rather than focused on specific diagnostic features. In a later study, Deng and Sloutsky (2016) gave 4-year-olds, 7-year-olds, and adults a category learning task in which there was a single rule-like deterministic feature that perfectly predicted category membership accompanied by multiple probabilistic features that only probabilistically predicted category membership (a paradigm developed by Kemler Nelson, 1984). After training, participants' categorization behavior and memory were tested in order to identify which features controlled participants' categorization choices and how well these features were remembered. When the instructions directed participants' attention to both the deterministic and probabilistic features (Experiment 1), adults and 7-year-olds tended to rely on the deterministic feature, whereas 4-year-olds relied on the probabilistic features. The 4-year-olds could and did rely on the deterministic feature when the instructions directed them to it (Experiment 2), just as did the 7-year-olds and adults. Yet, even when the categorization choices of children and adults were based on the deterministic feature, their memories were strikingly different. Consistent with their engagement of selective attention, the 7-year-olds and adults exhibited robust memory for the deterministic feature, but not for the probabilistic features; in contrast, the 4-year-olds remembered all of the features equally well, at odds with the idea of selective attention. Thus, older children's and adults' memory pointed to selective attention during category learning, whereas younger children's memory suggested more distributed attention.
Following Ashby and colleagues' theoretical proposal (Ashby et al., 1998; see also, Cincotta & Seger, 2007; Sloutsky, 2010), one possible explanation for this discrepancy between very young children and adults is that selective attention requires the active involvement of brain structures mediating what is called executive function: specifically, the prefrontal cortex (PFC), which is immature early in development. From this perspective, selective attention in category learning requires the participation of a fully developed and functional PFC, whereas similarity-based categorization can be accomplished by more primitive brain regions, such as the inferotemporal cortex and the basal ganglia. If a fully developed and functional human PFC is required to exhibit selective attention, then animals that have a less well-developed PFC or that do not have this structure at all may show an attentional pattern more similar to that of young children who have an immature PFC.
Given these premises, Couchman, Coutinho, and Smith (2010) explored whether humans and monkeys (Macaca mulatta) would rely on a single deterministic predictor or on several probabilistic predictors when learning to discriminate two artificial categories. Because monkeys have proportionally smaller frontal cortices than humans (Semendeferi et al., 2002), it was suspected that they might not show the same attention capacities as do humans. Human adults were trained under either explicit or implicit learning conditions, with the explicit condition being more directly comparable to the task given to monkeys. The prediction was that, as in earlier studies (e.g., Kemler Nelson, 1984), human adults would strongly rely on the deterministic predictor in the explicit condition, but strongly rely on the probabilistic predictors in the implicit condition. However, only 58% of the human participants relied on the single perfect predictor in the explicit condition (36% relied on the probabilistic features in the implicit condition). Two monkeys were given the same basic explicit task in a pair of experiments. The first experiment suffered from low categorization accuracy during the testing phase. The second experiment, with the same two monkeys, required several procedural modifications to improve their testing accuracy; its results suggested that the monkeys were distributing their attention among all of the features in the categorization stimuli. It seems safe to conclude from this study that humans were inclined to use the deterministic information, whereas monkeys were more inclined to use the probabilistic information, although the results of the study were not straightforward.
A subsequent study addressing the same issue with pigeons was conducted by Nicholls et al. (2011). Pigeons do not have a PFC, so their tendency to distribute attention and to rely on all of the available features when learning to categorize complex stimuli may be even more clear-cut. Indeed, some researchers have contended that pigeons lack the capacity for rule formation and selective attention (Smith et al., 2012); if so, then pigeons' excellent categorization performance may actually be based on their recognition of the overall similarity among the stimuli belonging to the same category rather than on their deployment of selective attention to the most diagnostic information. In Nicholls et al. (2011, Experiment 2), pigeons were shown artificial categories in which each exemplar was created from four spatially separated features: one was a perfect predictor, whereas the other three were probabilistic predictors (using only one feature could yield 75% accuracy; using all three features was required to reach 100% accuracy). When, in testing, the perfect and probabilistic predictors were put into conflict, most pigeons relied on the perfect predictor to classify the stimuli. Interestingly, those pigeons that did not do so also focused on a single feature, just not the perfect predictor. So, overall, pigeons' categorization behavior was controlled by only one feature (see also Lea & Wills, 2008, Lea et al., 2009, and Wills et al., 2009, for further results and discussion).
Categorization controlled by a single feature suggests selective attention, because that single feature must be preferentially processed amid all of the available features. More explicit evidence implicating attention to specific features in pigeons' categorization was provided by Castro and Wasserman (2014). In that study, pigeons were trained to classify stimuli from two different artificial categories, in which the exemplars contained both relevant (perfect predictors) and irrelevant features. Because tracking of peck location, similar to human eye tracking (e.g., Rehder & Hoffman, 2005), is a promising proxy for measuring pigeons' allocation of visual attention, Castro and Wasserman required their pigeons to peck anywhere at the category exemplars when they were presented on a computer screen (see also Dittrich et al., 2010). The authors found that, as pigeons' categorization accuracy progressively rose, so too did their pecks to the relevant category features; conversely, pigeons' pecks to the irrelevant category features progressively fell. In short, as pigeons were learning to categorize the stimuli, they also seemed to learn to attend preferentially to the relevant stimulus features (see also Castro & Wasserman, 2016a, 2017).
Although pigeons do not have a PFC, and their pallium is nucleated and lacks the distinctive laminar organization observed in the mammalian cortex (Jarvis et al., 2005), there are noteworthy parallels between the avian and the mammalian forebrains at the connectivity level. For example, birds' and mammals' forebrains are both modular, small-world networks with a connective core of hub nodes that, in birds, includes structures (e.g., the nidopallium caudolaterale) similar to the mammalian PFC. These hub nodes are centrally located and richly connected (Shanahan et al., 2013). It is thus conceivable that pigeons' brain characteristics are sufficient to allow them to solve categorization tasks akin to those solved by human adults (see Lazareva & Wasserman, 2010, for a comprehensive review).
Considering the prior empirical findings and theoretical analyses, we are left with two related, but unanswered questions. First, do animals lacking a mature mammalian PFC learn categorization tasks by selectively attending to the most predictive information (e.g., Castro & Wasserman, 2014; Nicholls et al., 2011) or do they tend to distribute their attention in the process of category learning so that their usage of the most predictive features is merely the result of those features acquiring high associative strength because they are strong predictors of reinforcement (e.g., Ashby et al., 1998; Couchman et al., 2010; Smith et al., 2012)? Second, to what extent is the role of attention in category learning similar in animals and humans?
To answer these questions, we deployed a category learning paradigm similar to that of Deng and Sloutsky (2016) with both human adults (Experiment 1) and pigeons (Experiment 2), to better understand how these different species attend to and process the available information for solving a categorization task. We also used computational modeling to determine people's and pigeons' attentional profiles in categorization testing. We expected that human adults would be prone to optimize attention and, thus, to focus on the deterministic feature of the category exemplars. At greater issue was pigeons' attentional performance. Would they too optimize their attention and focus on the deterministic feature? Or would they attend more diffusely, relying as well on the probabilistic features, in line with the behavior of very young children?
Experiment 1
In Experiment 1, human participants had to learn to categorize exemplars belonging to two categories. Each category had a prototype that was completely different in seven features from the prototype of the other category. The prototype itself was never presented in training, but the training exemplars highly matched the prototype (see Fig. 1). All of the training exemplars contained one deterministic feature that perfectly distinguished the two categories (e.g., the circle in the center of the
Experiment 2
Next, we studied the categorization response patterns of a very different species. Pigeons had to learn to categorize exemplars belonging to the same two categories as in Experiment 1. The experimental design and category structure for training and testing exemplars were the same as in Experiment 1. The only disparities were related to the different regimens required to study both species.
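The category structure shared by both experiments can be illustrated with a minimal sketch. It assumes binary features, seven features per prototype (as stated for Experiment 1), one deterministic feature, and a hypothetical conflict test item whose deterministic feature comes from one category while every probabilistic feature comes from the other. The feature position, the number of swapped features, and all function names are illustrative assumptions, not details taken from the study.

```python
import random

N_FEATURES = 7   # prototypes differ on seven features (from the text)
D_INDEX = 0      # assumed position of the deterministic feature

PROTO_A = [0] * N_FEATURES   # assumed binary coding of prototype A
PROTO_B = [1] * N_FEATURES   # prototype B differs on every feature


def make_training_exemplar(category, n_swapped=1, rng=random):
    """Keep the category's own deterministic feature and borrow
    n_swapped probabilistic features from the other prototype
    (n_swapped is an assumed value)."""
    own, other = (PROTO_A, PROTO_B) if category == "A" else (PROTO_B, PROTO_A)
    exemplar = list(own)
    p_positions = [i for i in range(N_FEATURES) if i != D_INDEX]
    for i in rng.sample(p_positions, n_swapped):
        exemplar[i] = other[i]
    return exemplar


def make_conflict_item(d_category):
    """Deterministic feature from d_category, all probabilistic
    features from the opposite category."""
    own, other = (PROTO_A, PROTO_B) if d_category == "A" else (PROTO_B, PROTO_A)
    item = list(other)
    item[D_INDEX] = own[D_INDEX]
    return item


def rule_choice(item):
    """Classify using only the deterministic feature."""
    return "A" if item[D_INDEX] == PROTO_A[D_INDEX] else "B"


def similarity_choice(item):
    """Classify by the number of features matching each prototype."""
    matches_a = sum(1 for x, p in zip(item, PROTO_A) if x == p)
    return "A" if matches_a > N_FEATURES - matches_a else "B"


conflict = make_conflict_item("A")
print(rule_choice(conflict), similarity_choice(conflict))
```

On training exemplars the two strategies agree, so accuracy alone cannot separate them; only items that pit the deterministic feature against the probabilistic ones reveal which information controls the choice.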
Computational modeling
We sought to determine to what extent people's and pigeons' attention was selective and focused on a single feature (presumably the deterministic feature) or distributed across some or all of the features in the stimuli. To do so, we modelled both species' categorization choices during testing to infer utilization scores for each feature (Macho, 1997).
A suitable modeling tool to better understand humans' and pigeons' performance is Nosofsky's (1986, 1988, 1992) Generalized
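To make the role of attention weights concrete, here is a minimal sketch of an exemplar model in the style of Nosofsky (1986), assuming the truncated sentence above refers to that family of models. It uses the standard ingredients: an attention-weighted city-block distance, an exponential similarity function, and a relative-similarity choice rule. The stimuli, the sensitivity value c, and the two weight profiles are illustrative assumptions; this is not the authors' fitting procedure and it does not compute Macho's (1997) utilization scores.

```python
import math


def gcm_prob_a(item, exemplars_a, exemplars_b, weights, c=2.0):
    """Probability of choosing category A for `item`.

    weights: attention weights over features (assumed to sum to 1).
    c: sensitivity parameter (assumed value).
    """
    def similarity(x, y):
        # Attention-weighted city-block distance, then exponential decay.
        d = sum(w * abs(xi - yi) for w, xi, yi in zip(weights, x, y))
        return math.exp(-c * d)

    s_a = sum(similarity(item, e) for e in exemplars_a)
    s_b = sum(similarity(item, e) for e in exemplars_b)
    return s_a / (s_a + s_b)


def one_swap_exemplars(d_value, n_features=7):
    """Hypothetical training set: feature 0 is deterministic, and each
    exemplar borrows one probabilistic feature from the other category."""
    exemplars = []
    for swapped in range(1, n_features):
        e = [d_value] * n_features
        e[swapped] = 1 - d_value
        exemplars.append(e)
    return exemplars


exemplars_a = one_swap_exemplars(0)
exemplars_b = one_swap_exemplars(1)

# Conflict item: deterministic feature from A, probabilistic features from B.
conflict = [0, 1, 1, 1, 1, 1, 1]

selective = [1.0, 0, 0, 0, 0, 0, 0]   # all attention on the deterministic feature
distributed = [1 / 7] * 7             # attention spread evenly over all features

p_selective = gcm_prob_a(conflict, exemplars_a, exemplars_b, selective)
p_distributed = gcm_prob_a(conflict, exemplars_a, exemplars_b, distributed)
print(round(p_selective, 3), round(p_distributed, 3))
```

Under these assumptions, the selective profile predicts that the conflict item is assigned to the category of its deterministic feature, whereas the distributed profile predicts the opposite choice. Fitting the weights to observed test choices is what would allow an attentional profile to be inferred for each species.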
General discussion
In the reported experiments, human adults and pigeons mastered (terminal accuracy surpassing 90% correct) a categorization task that could be solved by selectively attending to a single deterministic feature or by distributing attention across multiple probabilistic features. When only the deterministic feature was available, both human adults' and pigeons' accuracy was well above chance (see Only-D trials in Fig. 2), but humans' accuracy was much higher and did not suffer any decrement
CRediT authorship contribution statement
Leyre Castro: Conceptualization, Methodology, Investigation, Formal analysis, Visualization, Writing - original draft. Olivera Savic: Conceptualization, Methodology, Visualization, Writing - review & editing. Victor Navarro: Formal analysis, Visualization, Writing - review & editing. Vladimir M. Sloutsky: Conceptualization, Methodology, Visualization, Writing - review & editing. Edward A. Wasserman: Conceptualization, Methodology, Visualization, Writing - review & editing.
References (73)
- et al. (1992)
A two-stage model of category construction
Cognitive Science
- et al. (2014)
Working memory updating and the development of rule-guided behavior
Cognition
- et al. (2001)
The neurobiology of human category learning
Trends in Cognitive Sciences
- et al.
Multiple systems of perceptual category learning: Theory and cognitive tests
- et al. (2009)
Errors, efficiency, and the interplay between attention and category learning
Cognition
- et al. (2016)
Attentional shifts in categorization learning: Perseveration but not learned irrelevance
Behavioural Processes
- et al. (2016)
Executive control and task switching in pigeons
Cognition
- et al. (2006)
Development of cognitive control and executive functions from 4 to 13 years: Evidence from manipulations of memory, inhibition, and task switching
Neuropsychologia
- et al. (2016)
Selective attention, diffused attention, and the development of categorization
Cognitive Psychology
- et al. (1985)
The prefrontal “cortex” in the pigeon. Biochemical evidence
Brain Research
Dissociable mechanisms underlying individual differences in visual working memory capacity
NeuroImage
The avian ‘prefrontal cortex’ and cognition
Current Opinion in Neurobiology
Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models
Journal of Memory and Language
The effect of intention on what concepts are acquired
Journal of Verbal Learning and Verbal Behavior
Strength of response suppression to distracter stimuli determines attentional-filtering performance in primate prefrontal neurons
Neuron
Inside the corvid brain—Probing the physiology of cognition in crows
Current Opinion in Behavioral Sciences
Color mixing in the pigeon (Columba livia) II: A psychophysical determination in the middle, short and near-UV wavelength range
Vision Research
Eyetracking and selective attention in category learning
Cognitive Psychology
Implicit and explicit categorization: A tale of four species
Neuroscience & Biobehavioral Reviews
The dopaminergic innervation of the pigeon caudolateral forebrain: Immunocytochemical evidence for a “prefrontal cortex” in birds?
Brain Research
Flexible rule use: Common neural substrates in children and adults
Developmental Cognitive Neuroscience
A neuropsychological theory of multiple systems in category learning
Psychological Review
On the nature of implicit categorization
Psychonomic Bulletin & Review
The cost of selective attention in category learning: Developmental differences between adults and infants
Journal of Experimental Child Psychology
The psychophysics toolbox
Spatial Vision
Homology, neocortex, and the evolution of developmental mechanisms
Science
Pigeons’ tracking of relevant attributes in categorization learning
Journal of Experimental Psychology: Animal Learning and Cognition
Feature predictiveness and selective attention in pigeons’ categorization learning
Journal of Experimental Psychology: Animal Learning and Cognition
Dissociation between striatal regions while learning to categorize via feedback and via observation
Journal of Cognitive Neuroscience
Control of goal-directed and stimulus-driven attention in the brain
Nature Reviews Neuroscience
Rules and resemblance: Their changing balance in the category learning of humans (Homo sapiens) and monkeys (Macaca mulatta)
Journal of Experimental Psychology: Animal Behavior Processes
Peck tracking: A method for localizing critical features within complex pictures for pigeons
Animal Cognition
Wavelength discrimination in the “visible” and ultraviolet spectrum by pigeons
Journal of Comparative Physiology A
A configural theory of attention and associative learning
Learning & Behavior
Recent advances in operant conditioning technology: A versatile and affordable computerized touch screen system
Behavior Research Methods, Instruments and Computers
☆ This research was supported by National Institutes of Health Grants R01HD078545 to VMS and P01HD080679 to VMS and EAW.