Introduction

A topic of great interest in emotion research has been the universality of human emotions (e.g., Ekman, 1992). Studies of different emotion-related phenomena, such as facial expressions, have concluded that there is a set of emotions which are recognized and expressed similarly in most cultures (see Shaver, Murdaya, & Fraley, 2001, for an overview). Other researchers, however, have emphasized cross-cultural differences (e.g., Levy, 1984).

One approach to this field is the study of the lexicon of emotion terms, that is, the words that describe or relate to human emotions (i.e. “emotion words”). Since the last decades of the past century, there have been several attempts to elaborate comprehensive corpora that include all those emotion words. The earliest studies revealed a wide range on the number of words referred to emotions in distinct languages. Indeed, in a review of that literature, Russell (1991) reported an emotion lexicon of 2000 words in English, 1501 words in Dutch, and 750 words in Chinese. Such discrepancies may be partly due to differences in the methods used to elaborate those emotion lexicons; therefore, normative data, which are obtained following standardized procedures, are needed to allow cross-cultural research (Bates et al., 2001).

Some studies in the field have relied on dictionaries to extract emotion words. This is done manually by a group of coders who follow a set of criteria to categorize a word as an emotion-label word (Pavlenko, 2008). Emotion-label words, or simply emotion words, are those denoting emotions, such as anger, which are the focus of the present study. An example of the mentioned approach is the Annotated Lexicon of Chinese words (Ng et al., 2019), which contains 953 emotion terms. This is a costly procedure, which involves considerable manual effort, and requires the implementation of inter-judge agreement protocols. Apart from that, although such approach can yield a large number of emotion words, some of them may be of such low frequency that most speakers may not know their meaning. Indeed, it is important to distinguish between the linguistic characterization of the emotion lexicon of a particular language and the psychological characterization of that lexicon (i.e., the number of words used and considered as emotion words by the speakers).

To capture this psychological characterization, other approaches have been developed from subjective perception of those words. This is the case of the free-listing method, which consists of asking native speakers to produce as many emotion terms as they could in a short period of time (e.g., Frijda et al., 1995; Romney et al., 1997; Schrauf & Sanchez, 2004; Van Goozen & Frijda, 1993). It should be noted, however, that this method does not provide a representative sample of an individual’s entire emotion lexicon. Indeed, in such a time-constrained situation, participants produce the words which can be more easily retrieved from long term memory: those that are used more frequently, or that are more psychologically/culturally salient (Schrauf & Sanchez, 2004). Such limitation may be overcome by providing participants with a large list of potential emotion terms and asking them to rate the extent to which each of them refers to an emotion. This approach has been followed in several studies, inspired by Rosch’s (1978) work. According to this methodology, concepts for emotions, and their corresponding words, are prototypically organized (e.g., Fehr & Russell, 1984; Ortony et al., 1987; Shaver et al., 1987), similarly to other categories such as colors or physical objects. This means that emotional concepts are not defined in terms of a set of necessary and sufficient attributes, but rather vary in the degree or typicality to which they represent a particular emotion. In this fashion, the most prototypical or central exemplars are those that come more easily to mind and that are categorized faster as denoting emotions. For instance, Fehr and Russell (1984) identified concepts such as fear, sadness and anger as central exemplars of emotions in English, whereas concepts such as anxiety, disgust, or pride were found to be less central. The present work follows this prototype approach to typify the emotion lexicon in the Spanish languageFootnote 1.

The prototype approach has been followed by several studies in the last decades (e.g., Alonso-Arbiol et al., 2006; Galati, Sini, Tinti, & Testa, 2008; Niedenthal et al., 2004; Shaver, Schwartz, Kirson, & O’Connor, 1987; Shaver, Murdaya, & Fraley, 2001; Zammuner, 1998). Some of them have endeavored to identify the features or dimensions that contribute to emotion prototypicality. For instance, Zammuner (1998) reported prototypicality ratings for a list of 153 Italian emotion terms, as well as ratings for valence, intensity, and duration. Zammuner conceived valence as the hedonic charge of a word; that is, the degree to which it denotes an affective experience (regardless of whether it is pleasant or unpleasant). On the other hand, intensity was defined as the amount of the affective experience, and duration as the time that the emotional experience denoted by the term last at most. In a series of regression analyses, it was found that the three variables were significant predictors of prototypicality. Hence, Zammuner’s findings showed that the more hedonically saturated (either pleasant or unpleasant), intense and brief experience denoted by the word is, the more prototypical the word is considered to be. More recently, Niedenthal et al. (2004) carried out an emotion prototype analysis of 237 French words in which they also collected ratings for other word features, such as subjective age of acquisition or frequency. The authors found that these two variables were related to prototypicality, with higher frequency emotion words and those acquired earlier in life denoting the most prototypical emotional states. However, the effects of age of acquisition were mediated by frequency (note that both variables are highly correlated), leaving open the question of whether there is an independent age of acquisition effect. Additionally, and in line with Zammuner’s results, regression analyses revealed that the most important predictor of prototypicality was intensity.

The main aim of the present study is to provide prototypicality ratings for a large set of words. This is the first database of this kind in the Spanish language and the largest study using a prototypical approach in any language so far. Also of note is the inclusion of different grammatical categories in the dataset. Previous studies focused on either nouns (e.g., Alonso-Arbiol et al., 2006; Shaver et al., 1987, 2001) or adjectives (e.g., Galati et al., 2008). The rationale for using only nouns was that they increase the similarity of emotions to the objects used by Rosch (Shaver et al., 2001). By contrast, those who focused on adjectives argued that they are more easily associated with immediate emotional experiences (Galati et al., 2008). We decided to include nouns, adjectives, and verbs in order to cover the whole range of words denoting emotions and to give researchers as many options as possible to select materials for their experiments. Additionally, when available, we collected prototypicality ratings for nouns, adjectives and verbs referring to the same emotion (e.g., tristeza, triste and entristecerse; sadness, sad and sadden, respectively). In this way, we were able to examine if words in certain grammatical categories are more emotionally charged than others, which would have consequences for psycholinguistic experiments involving emotion words. Apart from this methodological reason, there were also more theoretical reasons to include words belonging to different grammatical categories in the dataset. Concretely, there could be differences across languages in the type of words that are more closely associated with emotions. For instance, using the free-listing method, Romney et al. (1997) found that English native speakers produced mostly nouns, while Japanese speakers produced mostly adjectives. Furthermore, it has been shown that other languages, such as Russian, tend to lexicalize emotions in verbs (Pavlenko, 2002). The preponderance of a type of word to express emotions is informative of the status of emotions in that particular language. Indeed, cultures where emotions are conceived as focused on the state of the individual (i.e., individualistic cultures) tend to use nouns and adjectives, while cultures where emotions are conceived more as interpersonal processes or relations (i.e., collectivistic cultures) have a preponderance of verbs in their emotion lexicon (Pavlenko, 2008). Finally, psycholinguistic research has evidenced a modulation of syntactic processes by emotional features. For instance, Palazova et al. (2011) found an interaction between emotional features and grammatical class at the word level. However, as suggested by Hinojosa et al. (2019), there is a need for further confirmation on how the emotional features of words interact with word category information.

The second aim of the present study is to identify the features that contribute the most to emotional prototypicality by examining the role of several variables not considered in previous studies. Those variables were chosen because of their relevance according to the most influential models of the human affective space (i.e., dimensional models and discrete emotions models), which have inspired most of the psycholinguistic research conducted on emotional language processing so far.

Dimensional models of emotion propose that the human affective experience can be described in terms of continuous variations of a small number of dimensions, the most relevant being valence and arousal (Bradley & Lang, 2000). In these models, valence is defined as the extent to which an affective experience is pleasant or unpleasant, while arousal refers to the degree of activation it entails. Dimensional models have dominated psycholinguistic research on the effects of affective content of words on language processing, with evidence that both valence and arousal modulate such effects (see, for instance, Hinojosa et al., 2019, for a review). The other main theoretical proposal in the field is the so-called “discrete emotions” approach, which describes the human affective space in relation to a discrete number of emotions with specific characteristics, physiological correlates, behavioral action tendencies, and associated emotional experiences (e.g., Ekman, 1992; Panksepp, 1998). Recently, a few psycholinguistic studies have shown that the extent to which words are related to discrete emotions may affect their processing as well (e.g., Briesemeister et al., 2011; Briesemeister et al., 2015; Ferré, Haro, & Hinojosa, 2018; Silva et al., 2012).

We can therefore glean that psycholinguistic research on emotion and language has taken into account the dimensions of valence and arousal, and, to a lesser extent, discrete emotions. However, all those variables were not considered in past studies on emotion prototypicality. As previously mentioned, those studies highlighted the role of valence and intensity. It should be noted that these dimensions do not exactly correspond to “valence” and “arousal” as described by Bradley and Lang (2000). “Valence” means hedonic load (regardless of its polarity) for Zammuner (1998) and Niedenthal et al. (2004), while it precisely refers to the polarity of experience (i.e., pleasant or unpleasant) in dimensional models. Intensity is conceived by Zammuner and by Niedenthal et al. as a dimension that goes beyond arousal; in line with Frijda et al. (1992), they consider it to be a complex dimension that involves changes in the level of arousal/activation, but also in other components of the emotional experience, such as in the impulse to act. Hence, there is a need to assess the role of valence and arousal, as well as of discrete emotions, in order to have a better characterization of emotion prototypicality vis a vis the most influential models in the field. This is what we have done in this study, where we included ratings for valence, arousal, as well as for emotionality (i.e., the hedonic load of a word, regardless its polarity). This last variable closely corresponds to the “valence” variable assessed by Zammuner (1998) and Niedenthal et al. (2004). Including emotionality allowed us to contrast its role with that of valence on the prototypicality of emotion words. We also examined discrete emotions. Including them in this study enabled us to examine if words related with particular emotions are more prototypical than those related with other emotions, an issue never addressed before. Furthermore, characterizing emotion words in relation to dimensional and discrete emotion theories may contribute to build bridges between the extensive literature on affective word processing and the scarce studies on the dimensions that contribute to the affective load of words.

Apart from affective variables, we also investigated the role of other psycholinguistic variables, in particular: concreteness, frequency, and age of acquisition (AoA). We examined concreteness because some studies (Kousta et al., 2011) suggest that there is a relationship between words’ concreteness and their emotionality, in that abstract words would be more affectively loaded than concrete words (see, for instance, Ferré, Anglada-Tort, & Guasch, 2018; Hinojosa et al., 2014; Moffat et al., 2015, for evidence in favor of that proposal, but see Palazova, 2014, for a review reporting contrasting findings). Including concreteness in this study enabled us to examine for the first time whether, among words referred to emotions, abstractness is related with prototypicality. Moreover, we included frequency and AoA because Niedenthal et al. (2004) found that both variables were related with prototypicality in French. Our purpose was to examine those relationships in Spanish.

To sum up, our purpose was to develop the first large database on the Spanish emotion lexicon based on a prototype approach. To this end, we collected prototypicality ratings for all the words included in the dataset through questionnaires. We also aimed to identify the variables which contribute to the prototypicality of words referred to emotions. We considered a set of affective and psycholinguistic variables: valence, arousal, emotionality, happiness, sadness, fear, disgust, anger, age of acquisition, frequency and concreteness. Regarding these variables, we obtained the data from published normative studies whenever available and collected our own ratings for the remaining words. This dataset will facilitate the selection of experimental materials for psycholinguistic experiments. There are already published databases from which researchers interested in the relation between language and emotion select their materials (e.g., Ferré et al., 2012; Guasch et al., 2016; Hinojosa, Martínez-García et al., 2016; Stadthagen-Gonzalez et al., 2017; Warriner et al., 2013; see Fraga et al., 2018 for a web-based search engine for Spanish words). However, those datasets were created to be used as stimuli databases and they do not necessarily contain all the emotion terms in each language. In fact, they do not distinguish between emotion words and emotion-laden words (i.e., those that provoke emotional reactions although they do not denote any emotion, e.g., murder). A few recent studies have compared these two classifications, reporting mixed findings (e.g., Martin & Altarriba, 2017; Wang et al., 2019; Zhang et al., 2018). However, it should be noted that the distinction between emotion words and emotion-laden words was done intuitively in those studies. Prototypicality ratings included in this study will provide researchers with a measure to select the most representative emotion words to be compared with emotion-laden words. Our large database of emotion words in Spanish will prove useful also in comparative analyses of emotion words across languages and cultures. Indeed, once the emotion lexicon (and its most prototypical words) is identified in different languages and cultures, comparative analyses can be performed in order to know if the words that are considered as more prototypical exemplars of the emotion category are the same or vary from language (culture) to language (culture). The dataset presented here also has practical applications; as Zammuner (1998) pointed out, emotion prototypicality can be informative in examining the acquisition of emotion words (concepts) in children, to select the words to include in checklists and inventories aimed to assess people’s emotional states or skills, such as The Emotional Quotient Inventory (EQ-I; Bar-On, 1997), or The State-Trait Anxiety Inventory (STAI; Spielberger et al., 1983), as well as for a better characterization of the emotion words produced by people in narrative therapies (Pennebaker & Seagal, 1999), among others.

Method

Participants

We collected ratings from a total of 1127 native speakers of Spanish. Their mean age was 21.9 years (range, 19–63; SD: 1.5) and 949 of them (84.2%) were women. Most participants were students at universities in different regions of Spain, namely Universitat Rovira i Virgili, Tarragona (436), Universidad de Murcia (385), Universidad Complutense de Madrid (138), Universidade de Santiago de Compostela (123), and others (45). They completed up to five questionnaires in exchange for academic credit or as volunteers. Participants who completed more than one questionnaire did not repeat the same set of words.

Materials

We selected a large set of potential emotion words following these steps: First, we translated into Spanish over 700 words used in previous studies on emotion words and emotion prototypicality (e.g., asombro, astonishment in English) conducted in English (Altarriba et al., 1999; Bauer & Altarriba, 2008; Johnson-Laird & Oatley, 1989; Kazanas & Altarriba, 2016), French (Niedenthal et al., 2004) and Italian (Zammuner, 1998). Second, we selected over 800 additional words. Some of them were synonyms of the translations included in the first step, (e.g., admiración, amazement in English, as a word very closely related in meaning to asombro), while others were morphologically modified words (e.g., asombrar and admirar as verbs), including pronominal verbs (e.g., asombrarse) and participles used as adjectives in Spanish (e.g., asombrado), and yet others were selected from different affective norms in Spanish (e.g., Ferré et al., 2012; Guasch et al., 2016; Redondo et al., 2005; Redondo et al., 2007). Finally, each of the authors evaluated whether each of the preselected words fit into the definition of emotion word provided by Pavlenko (2008), namely: “…words that directly refer to particular affective states (“happy”, “angry”) or processes (“to worry”, “to rage”), and function to either describe (“she is sad”) or express them (“I feel sad”) ” (p. 148). Those words evaluated as emotion words by a majority of the authors (i.e., at least 4 out 7) were included in the final list of emotion words. This set comprised 1287 words, among which 449 words (34.9%) were nouns, 552 words (42.9%) were adjectives, and 286 words (22.2%) were verbs.

Procedure

We collected prototypicality ratings through questionnaires for the whole set of 1287 words. Regarding the other variables, there were already normative values for some, but not all, the words in our list (Alonso, Díez, & Fernández, 2016; Alonso, Fernández, & Díez, 2015; Davies et al., 2013; Duchon et al., 2013; Ferré et al., 2017; Guasch et al., 2016; Hinojosa, Martínez-García et al., 2016; Hinojosa, Rincón-Pérez et al., 2016; Redondo et al., 2005; Stadthagen-Gonzalez et al., 2017, 2018), so we used questionnaires to collect ratings for the words for which ratings were not available from previous normative studies. In the questionnaires, we also incorporated some words already rated in those databases as “control” words for validation purposes (see the Results section). Overall, including the new words and the control words, questionnaires included 743 words for valence, 746 for arousal, 746 for happiness, 746 for sadness, 744 for fear, 739 for disgust, 748 for anger, 1056 for AoA, and 1102 for concreteness ratings. Several filler words were also taken from the same datasets for the AoA and concreteness variables in order to cover the entire range of the rating scale. Finally, all questionnaires started with seven calibrator words (selected from the normative data, too) to expose participants to the full range of the scale before rating the critical items. Those calibrators included two words with low scores, three words with medium scores, and two words with high scores in each respective scale. Since there were no previous prototypicality databases in Spanish, we used as calibrator words for that variable the Spanish translation equivalents of seven words selected from Niedenthal et al.’s (2004) ratings for French words and Zammuner’s (1998) ratings for Italian emotion words.

The word set was randomly divided into eight prototypicality questionnaires with 167 words each. For the other variables, there were five questionnaires with 153–155 words each for valence, arousal, happiness, sadness, anger, fear and disgust; five AoA questionnaires with 236 words each; and five concreteness questionnaires with 246 words each. The order of presentation for all words in a given questionnaire was randomized individually for each participant (except the calibrators which were always presented first). Participants accessed the questionnaires online through Qualtrics (http://www.qualtrics.com). They first provided informed consent and completed a few demographic questions and were then given written instructions for the target variable. The instructions for prototypicality were adapted from those by Niedenthal et al. (2004) and Zammuner (1998): We asked participants to rate to what extent does a word refer to an emotion. They performed their ratings by clicking on a five-point scale ranging from 1 = “Esta palabra no se refiere a una emoción” (This word does not refer to an emotion) to 5 = “Esta palabra se refiere claramente a una emoción” (This word clearly refers to an emotion). The exact wordings of the instructions in Spanish as well as an English translation are provided in Appendix 1. Regarding the other affective variables, we used the same instructions and scale for valence and arousal as Stadthagen-Gonzalez et al. (2017). Participants rated each word’s valence and arousal on a 1-to-9 scale (1= unhappy, 9 = happy, for the valence scale, and 1 = quiet and 9 = excited, for the arousal scale). We also used the same instructions as Stadthagen-González et al. (2018) for discrete emotions, with participants rating each word on a 1-to-5 scale in relation to happiness, sadness, fear, disgust, and anger (1 = not at all, 5 = extremely). Concerning the psycholinguistic variables, the instructions and the 11-point scale for AoA were the same as those used by Alonso et al. (2015), where 1 = AoA lower than 2-years-old and 11 = AoA equal or higher than 11-years-old. The instructions and the seven-point scale for concreteness were similar to those used by Guasch et al. (2016), where 1 = highly abstract word, and 7 = highly concrete word. In all the cases, participants were also able to indicate that they did not know the word (“No conozco la palabra”). The exact wordings in Spanish as well as an English translation of the instructions are also provided in Appendix 1. Words were randomly displayed five at a time on each screen and for each participant, with a rating scale under each word. After rating a set of five words, participants clicked on the ‘Next’ button to display another five words set. The name of the variable under measure (e.g., prototypicality) was displayed at the top and bottom of each screen. Participants took about 20 min to complete a questionnaire.

Results

Data cleaning and descriptive statistics

Each questionnaire was rated by at least 20 participants. Individual ratings that correlated less than .10 with the average ratings of the rest of participants were removed and replaced with ratings from a new participant who completed the respective questionnaire (0.97% of the collected data). We also removed and replaced those data sets where the same rating value was given for more than 95% of the words in a questionnaire (0.09% of the collected data). Most words in the prototypicality ratings (82.1%) were known by at least 18 respondents (90% of responses), but a subset of the words in the list were not known by a majority of raters: ratings for 55 words (4.3%) are based on less than 50% of responses (i.e., no more than ten participants; with the remaining indicating they did not know the word); 12 words (0.9%) obtained less than 25% of responses (i.e., no more than five participants), and the word oprobio (opprobrium) was not known (and therefore not rated) by any of the participants filling that particular questionnaire, so it was consequently eliminated from the final dataset.

Descriptive statistics and density plots for all the collected variables are presented in Table 1 and Fig. 1, respectively. The data presented there includes ratings obtained in the present study as well as those taken from pre-existing databases (the number of new ratings collected through questionnaires is as follows: 1286 words for prototypicality, 557 for arousal and valence, 570 for happiness, 582 for sadness, 566 for fear, 561 for disgust, 571 for anger, 789 for AoA, and 835 for concreteness). Apart from those variables, the table also includes a frequency measure (Zipf value; Van Heuven et al., 2014), obtained from EsPal (Duchon et al., 2013). We also computed the variable ‘emotionality’ to assess the hedonic or affective load of the words (i.e., similar to the valence measure used by Zammuner, 1998 and Niedenthal et al., 2004). To this end, we subtracted 5 (the midpoint of the 1-to-9 scale) from the valence score of each word. This variable, which was expressed as the absolute value of that subtraction, ranged from 0 to 4, where 0 means a non-affectively loaded (neutral) word and 4 means an extremely loaded (either pleasant or unpleasant) word. As can be seen in Fig. 1, the distribution of emotionality values fitted a normal curve and had a positive skew of .366. Valence ratings showed two peaks, fitting rather a bimodal than a normal distribution. Happiness and disgust have the highest positive skews (0.821 and 1.339, respectively), with the median values clearly lower than the midpoint of the scale (see also Table 1). By contrast, arousal and AoA measures show a moderate negative skew (– 0.755 and – 0.740, respectively), with median values higher than the midpoint of the corresponding scale. Mean and median Zipf frequency values (over 3.2 points) show that our words are centered near the midpoint of the intuitive scale suggested by Van Heuven et al.’s (2014) for this variable: Values of 1–3 represent low frequency words, and values of 4–7 represent high frequency words. Note, however, that our database does not contain very high frequency words, with a maximum Zipf value of 5.61 (a Zipf value of 6 and 7 correspond to approximately 1000 and 10,000 occurrences per million, respectively; see Van Heuven et al., 2014). The rest of the variables show low skews and ranges that cover the entirety of the corresponding scale. Overall, the words in the dataset tend to be mid-frequency, late-acquired words, with a tendency to denote mid-to-high arousal levels, mid-emotionality levels, and there are more unpleasant than pleasant words.

Table 1 Descriptive statistics of the variables (complete set)
Fig. 1
figure 1

Density plots of each of the variables

Reliability and validity of the norms

We computed the standard error of the mean and the confidence intervals at 95% of each word for prototypicality ratings as accuracy measures. The standard error of the mean ranged from 0 to 0.48, and the overall mean was 0.28. The error margin means ranged from 0 to 0.94 and the overall mean was 0.55. We also explored the inter-rater reliability of the ratings by calculating the intraclass correlation coefficients (ICCs) for all the questionnaires using the psych package in R (Revelle, 2019). Since there were several questionnaire versions for each variable (eight for prototypicality and five for the remaining variables), we present in Table 2 the average ICC and some variability statistics for each variable.

Table 2 Mean (M), standard deviation (SD), coefficient of variation (CV), and range for the ICC values of the questionnaires for each variable

We also assessed the validity of our ratings. The common approach for this purpose is to compare the new values to those reported in previous normative studies. However, there are no such studies for emotional prototypicality in Spanish, so, we compared our ratings with those of the translation equivalent words we had in common with studies conducted in other languages. Pearson correlations were moderate to high in all comparisons, including the words in common with databases in American English (Shaver et al., 1987, study 1), r(170) = .531; Basque (Alonso-Arbiol et al., 2006), r(122) = .601; French (Niedenthal et al., 2004), r(194) = .628; and Italian (Zammuner, 1998), r(144) = .495 (Bonferroni adjusted alpha level of .013 per test (.05/4), with p < .001 in all correlations).

The validity of the remaining variables was assessed by comparing the ratings of the control words in our database to their ratings in the Spanish studies from which they were obtained. Significant correlations were found for each variable (Bonferroni adjusted alpha level of .003 per test (.05/17), with p < .001 in all correlations): r(186) = .971 and r(192) = .588 with the valence and arousal ratings from Stadthagen-Gonzalez et al. (2017), respectively; r(108) = .970 and r(74) = .965 with the happiness scores, r(93) = .956 and r(84) = .935 with the sadness scores, r(101) = .925 and r(77) = .947 with the fear scores, r(115) = .944 and r(63) = .947 with the disgust scores, r(106) = .928 and r(73) = .937 with the anger scores from Ferré et al. (2017) and Stadthagen-González et al. (2018), respectively; r(261) = .964, r(139) = .960, and r(124) = .935 with the AoA ratings from Alonso et al. (2015), Alonso et al. (2016) and Davies et al. (2013), respectively; and finally, r(153) = .874 and r(158) = .880 with the concreteness scores from Guasch et al. (2016) and ESPaL (Duchon et al., 2013), respectively.

Prototypical emotion words

Prototypicality ratings for the words included in the database are distributed across the entire range of the scale. These ratings show a low skew, a global mean score close to the midpoint of the scale, and an approximate normal distribution (see Table 1 and Fig. 1). In order to characterize the profile of the prototypical emotion words in Spanish, we performed several descriptive and exploratory analyses. First, we selected those words with mean ratings equal or above 3, the midpoint of the prototypicality scale. This high-prototypicality subset was comprised of 549 words (43% of the entire set) and the descriptive statistics for all the variables are shown in Table 3.

Table 3 Descriptive statistics of the variables for the high prototypicality subset of words

We then categorized each word of the subset as ‘pleasant’/‘unpleasant‘, ’low-arousal’/’high-arousal’, ‘not related’/‘very related’ to a discrete emotion, ‘early-acquired’/’late-acquired’ in life, ‘abstract’/‘concrete’, ‘low-frequency’/’high-frequency’, and belonging to a particular grammatical category (noun, adjective, or verb). Regarding affective variables, and following Ferré et al. (2017), we classified as pleasant words those with mean valence ratings ≥ 6, whereas unpleasant words were those with mean valence ratings ≤ 4; low-arousal words were those with mean arousal ratings ≤ 4, while high-arousal words were those with mean arousal ratings ≥ 6; and finally, following Stadthagen-González et al. (2018), words not related with any discrete emotion were those with a mean rating < 3 in all discrete emotions, whereas words very related with a discrete emotion were those with a mean rating ≥ 3 in at least one discrete emotion. With respect to the other variables, we considered as early-acquired words those with AoA mean ratings < 6, and late-acquired words those with mean ratings ≥ 6; abstract words those with concreteness mean ratings < 4, and concrete words those with mean ratings ≥ 4. Finally, and following Van Heuven et al. (2014), we classified as low-frequency words those with Zipf < 4 (less than ten occurrences per million words) and as high-frequency words those with Zipf ≥ 4 (equal or above ten occurrences per million words). We then calculated the number and the percentage of each type of word in the high-prototypicality subset, which are represented in Fig. 2. The figures and the corresponding Chi-square tests of independence (all at p < .001) show that the highly prototypical emotion words are mostly unpleasant (2 out 3 words, X2 (1, n = 500) = 42.22), high-arousal (2 out 3 words, X2 (1, n = 409) = 85.29), representative of at least one discrete emotion (4 out 5 words, X2 (1, n = 548) = 33.06), late-acquired in life (9 out 10 words, X2 (1, n = 549) = 533.42), concrete (2 out 3 words, X2 (1, n = 549) = 32.60), low-frequency (4 out 5 words, X2 (1, n = 542) = 278.16), and adjectives (almost 1 out 2 words, X2 (2, n = 549) = 39.87).

Fig. 2
figure 2

Percentage and number of words in the high prototypicality set according to distinct classifications

We examined in depth the discrete emotions to which the highly prototypical subset of words was related to. For that purpose, we calculated the number and percentage of highly prototypical words very related with one or more discrete emotions (i.e., words with ratings > 3 in that emotion; note that a given word can be simultaneously related to more than one discrete emotion; see Fig. 3a). The discrete emotion with the highest representation on the subset is sadness, with 207 (37.7%) words, followed by anger and happiness, with 147 (26.8%) and 145 (26.4%), respectively, and fear, with 120 (21.9%). Disgust is only marginally represented, with 34 (6.2%) words. We also examined whether words that are highly related with one or more emotions were pure words. Following Stadthagen-González et al. (2018), pure words are defined as those strongly related (i.e., with a rating of 3 or more) to a particular emotion but unrelated (i.e., ratings lower than 3) to the other emotions. The percentage of pure words in relation to the words very related with each discrete emotion was also computed. Results showed that happiness was the discrete emotion with the highest percentage of pure words (98.6%, n = 143), followed by sadness (42.5%, n = 88), anger (29.9%, n = 44), fear (18.3%, n = 22), and disgust (18.3%, n = 9) (see Fig. 3a). A complementary analysis was performed by calculating some analogue metrics of those usually employed to characterize the words’ perceptual load profile (e.g., Morucci et al., 2019; see Lynott & Connell, 2009). First, we calculated the dominant emotion of each word (the analogue of ‘dominant modality’ in terms of Lynott and Connell, 2009) as the discrete emotion with the highest score among the five discrete emotions. We then calculated the emotional exclusivity of each word (the analogue of ‘modality exclusivity’ in terms of Lynott and Connell, 2009), as the range of the scores across the five discrete emotions divided by summed single scores in each discrete emotionFootnote 2.This metric goes from 0 to 1, where a value of emotional exclusivity of 0 means that the word is fully multi-emotional (i.e., a word with exactly the same score in all the discrete emotions), and a value of 1 means that the words is purely uni-emotional (i.e., a word with a given score higher than the minimum value of the scale in only one emotion and with the minimum score in the rest). Figure 3b shows the distribution of words in terms of dominant emotion rate and mean emotional exclusivity, classified by their dominant emotion. We found that 184 words among the highly prototypical subset have sadness as the dominant emotion, followed by happiness with 177 words, anger with 117 words, fear with 60 words, and finally disgust with only 14 words. Happiness has the highest emotional exclusivity mean (M = .65, SD = .22) and the rest of discrete emotions show moderate means (sadness, M = .36, SD = .08; anger, M = .32, SD = .07; fear, M = .32, SD = .09; and disgust, M = .37, SD = .08).

Fig. 3
figure 3

Spider plots illustrating a) the percentage of very related and pure words, and b) rate of dominant emotion (number of words with a dominant emotion divided by the total of words of the subset), and emotion exclusivity mean across the discrete emotions in the highly prototypical words subset

Finally, we examined the features of the ten words with the highest prototypicality ratings. As can be seen in Table 4, the names of the discrete emotions (and some synonymous words), except disgust, are among the most prototypical emotion words in Spanish. All these words have a high load in the distinct affective variables, with four pleasant and six unpleasant words, and with six high-arousal and no low-arousal words. Moreover, all of them are very related with a discrete emotion, and all of them except atemorizado”(daunted) are pure words: Four happiness-related words, two sadness-related words, one fear-related word, and two anger-related words. Disgust is the only discrete emotion not represented in this subset. The top word very related with such emotion is “odio” (hate), which appears at the 26th position in descending order according to its prototypicality. Additionally, the name of the emotion itself, asco (disgust), occupies the 84th place in the descending order of prototypicality, and the most prototypical pure word belonging to the disgust category is desagrado” (displeasure) at the 110th position. Given all this, it seems that disgust-related words are not well represented among emotion words in Spanish.

Table 4 Affective and psycholinguistic features of the ten most prototypical words

Bivariate relationships

Table 5 presents the matrix of pairwise correlations between all the studied variables for the entire set of words. Prototypicality, the focus of the present work, is most closely correlated with emotionality (note, however, that the correlation is moderate), indicating that highly prototypical emotion words are also more affectively loaded. There are also some more modest but significant relationships with the other affective variables, namely a positive correlation with arousal and a negative correlation with valence. There are positive correlations with sadness, fear and anger. Concerning the non-affective variables, the only significant (negative) correlation is that observed with AoA, whereas neither lexical frequency nor concreteness show any significant relation with prototypicality.

Table 5 Correlation matrix of the analyzed variables

Underlying structure of emotional prototypicality

To explore the inner structure underlying all the affective and psycholinguistic variables and particularly to find out the locus of emotional prototypicality in that structure, the scores of the entire set of words were subjected to an exploratory factor analysis with variable maximization (varimax) rotation. The Kaiser–Meyer–Olkin measure, KMO = .810, confirmed the sampling adequacy for the analysis, and the Bartlett’s test of sphericity, X2 (66) = 9444.3, p < .001, indicated that correlation structure is adequate for factor analyses. The principal components analysis with the Kaiser’s criterion of eigenvalues greater than 1 showed a four-factor solution, accounting for 72.91% of the overall variance (see Table 6). The first rotated factor, which accounted for 40.23% of the variance, is clearly formed by the five discrete emotion variables plus valence. Valence and happiness have negative loads on this factor as high scores in both variables indicate pleasantness, while anger, sadness, fear, and disgust go in the opposite direction, in which high scores indicate good examples of an unpleasant emotion. The second factor, which accounted for 15.17% of the variance, is made up of prototypicality, emotionality and arousal, all of them with positive loads. The third factor explained the 8.96% of the variance and is formed mainly by Zipf and AoA, and also emotionality to a lesser extent. Finally, there is a fourth component that explained 8.53% of the variance and is primarily made up of concreteness and also AoA to a lesser extent.

Table 6 Exploratory factor analysis of the complete set for emotional variables

Prediction of prototypicality

A multiple linear regression analysis was carried out with the complete set of words to predict prototypicality based on the other variables. First, we performed the analysis introducing all the 11 predictor variables. However, this model showed inadequate multicollinearity coefficients for valence (tolerance = .066 and variance inflation factor, VIF = 15.245) and happiness (tolerance = .089 and VIF = 11.211). Therefore, we performed a second analysis but following the stepwise method, which reduces the regression model to those factors that slightly contribute to the explained variance, and therefore may avoid an excessive multicollinearity; as a matter of fact, acceptable multicollinearity coefficients were obtained, with a minimum tolerance of .235 and a maximum VIF of 4.251 for happiness. The resulting model equation was significant, F(8, 1259) = 42.544, p < .001, with an R2 of .212 (adjusted R2 of .208), and with eight factors included in the model (see Table 7). The variables with significant standardized Beta-weights, from highest to lowest, are: Sadness, happiness, emotionality, disgust, arousal, anger, AoA, and Zipf, showing that prototypicality ratings are partially explained by six affective variables (emotionality, arousal, sadness, happiness, anger, and disgust) and by two lexico-semantic variables (AoA and lexical frequency).

Table 7 Linear regression analysis on prototypicality

Grammatical class, word family, and prototypicality

The lexical base of the words (i.e., a concept of Spanish linguistics referring to the word used to generate other word(s) by morphological conversion) was used to classify them into lexical-semantic familiesFootnote 3. For instance, “admiración” (admiration) and “admirado” (admired) are derived from “admirar” (to admire), and therefore the three words were classified as belonging to the same word family “admirar”. There were 499 different lexical-semantic families, of which 90 (18.0%) contained only one word, 144 (28.9%) contained two words, 198 (39.7%) contained three words, 43 (8.6%) contained four words, 14 (2.8%) contained five words, and 10 (2.0%) contained 6–11 words. We explored whether words of different grammatical classes but belonging to the same family are similarly rated in emotional prototypicality, by examining only those families that contained the three grammatical classesFootnote 4. First, we obtained the pairwise correlations of prototypicality ratings between grammatical classes, where each word family was considered as an observation. All correlations were positive, high, and significant (Bonferroni adjusted alpha level of .016 per test (.05/3), and p < .001 in all correlations), with r(369) = .709 for adjectives and nouns, r(264) = .681 for adjectives and verbs, and r(246) = .726 for nouns and verbs. These correlations show that prototypicality ratings are overall associated to lexical base and not to a specific grammatical class. However, this does not mean that words of different grammatical classes within the same family have the same prototypicality score. Hence, to test for possible differences in prototypicality by grammatical class, we carried out a repeated-measures ANOVA to compare the overall means of prototypicality in adjectives, nouns and verbs with the word family as the random factor. Only word families with data in the three categories were included in the analysis (n = 237). Mauchly’s test indicated that the assumption of sphericity was not violated, X2(2) = 1.443, p = .486. The ANOVA showed significant differences between the three categories, F(2,472) = 35.847, p < .001. Three paired post hoc t tests with a Bonferroni adjusted alpha level of .016 per test (.05/3) confirmed that adjectives (M = 3.11, SD = 0.71) are slightly more prototypical than nouns of the same family (M = 2.89, SD = 0.80), t(236) = 5.362, p < .001. Adjectives are also more prototypical than verbs (M = 2.79, SD = 0.73), t(236) = 8.400, p < .001, and nouns are more prototypical than verbs, t(236) = 2.827, p = .005 (see Fig. 4).

Fig. 4
figure 4

Boxplots of differences in prototypicality scores per word root in the three comparisons by grammatical categories

Discussion

The main aim of this work was to provide subjective norms for emotional prototypicality for a large set of Spanish words. The second aim was to identify the affective and psycholinguistic variables that contribute the most to prototypicality. The norms derived from our study contain ratings for 1286 words in ten different variables.

Concerning the first aim of the study, it should be noted that the norms show high indices of reliability and validity. With respect to reliability, valence, arousal, the five discrete emotion variables, and AoA were the most consistently rated. Prototypicality and concreteness also exhibited high values. Thus, all the variables examined showed good or excellent inter-rater reliabilities (see Parsons et al., 2019). These results are in line with the good and excellent reliability coefficients obtained in previous normative studies, by using diverse methods (e.g., Alonso et al., 2015, 2016, for AoA ratings; Ferré et al., 2017; Guasch et al., 2016; Hinojosa, Martínez-García et al., 2016; Hinojosa, Rincón-Pérez et al., 2016; Stadthagen-Gonzalez et al., 2017, 2018, for affective and concreteness ratings; and Zammuner, 1998, for prototypicality ratings).

Validity was examined by comparing our ratings with those of their translation equivalents in other languages whenever they were available, since there are no previous studies in Spanish. Pearson correlations were moderate to high in all comparisons, and are similar to those found by Zammuner (1998), who reported a r(130) = .61(p < .001) in the comparison between prototypicality ratings of Italian terms and their American equivalents (Shaver et al., 1987). Van Goozen and Frijda (1993) reported similar results when they compared the emotion words most frequently written by participants from six European countries (England, Netherlands, Belgium, France, Switzerland, and Italy) in a free listing task. They obtained pairwise correlations that ranged from .21 to .91. Although more transcultural studies using the same methodological approach are clearly needed, what these results reveal is that prototypicality ratings seem to be highly similar across languages (i.e., English, French, Italian, Basque, and Spanish), generations (i.e. studies carried out in four different decades), and cultures (see Mesquita and Frijda, 1992, for a review on cultural variations in emotions).

The validity of the remaining variables was assessed by comparing the ratings of the control words with those from other normative studies in Spanish. All paired correlations (except arousal) are very high (r > .87), indicating an adequate criterion validity of the variables collected in the present work. The case of arousal deserves special attention: the validity of arousal ratings shows a wide range of values across studies, being commonly lower than that obtained for valence (see, for instance, Leveau, Jhean-Larose, Denhière, & Nguyen, 2012). As there is no clear explanation for that finding, we can suggest some speculative hypotheses that would need to be tested against empirical data. One possibility is that arousal, as a concept, is more difficult to be understood than valence. It may be that some raters understand arousal as a feature completely separated from valence (i.e., a feeling that can be assessed on its own, like a feeling of stimulation, excitation, etc.), but others understand it as closely related to valence. This last group might conceive arousal as the intensity of valence (i.e., the intensity of the pleasant or unpleasant experience; see Kron et al., 2015). This different conception of arousal may contribute to variability in ratings across participants. Another possible cause of such greater variability is the higher relationship of arousal with a physiological experience than valence. It may be that the accurate rating of activation requires a more experiential correlate than valence may require. Therefore, our participants may focus more on their feelings or on their physiological reaction when assessing arousal than valence, whose assessment may be more based on their semantic knowledge about the pleasant or unpleasant nature of the stimulus. This hypothesis has not been directly studied so far but it is suggested from some available evidence. Nicolle and Goel (2013) asked participants to rate the believability of a number of sentences, and then examined the influence of given ratings on valence and arousal estimations. They observed that valence, but not arousal, was influenced by the extent to which stimuli are consistent or inconsistent with beliefs. From this result, the authors concluded that both variables are penetrable to different degrees by cognitive processing.

Apart from examining the reliability and validity of our norms, we carried out several analyses to characterize the words in our database. Most of those analyses were restricted to the high-prototypicality subset of words. The analyses on affective variables revealed that the highly prototypical words were mostly unpleasant words with a high level of arousal. This result agrees with previous studies in the field. For instance, Schrauf and Sanchez (2004) conducted a free listing study with Spanish (Mexico) and English (USA) speakers. After producing the words, participants had to classify them into pleasant, unpleasant and neutral. Participants produced more words related to negative emotions (48%) than to positive emotions (32%). Similarly, Zammuner (1998) reported that 65% of the stimuli in her dataset were considered by the speakers to be unpleasant words while 36% were considered to be pleasant words. The simplest explanation would be that there are more unpleasant emotion words because there is a wider variety of unpleasant emotions (sadness, fear, disgust, and anger) while, by contrast, there are less pleasant words crowded into only one pleasant emotion (happiness). Another more elaborated explanation for the preponderance of unpleasant words in the emotion lexicon is the feelings-as-information theory , which proposes that unpleasant emotional experiences signal that “there is a problem” (e.g., Schwarz and Bless, 1991; Schwarz, 1990). Consequently, unpleasant experiences trigger a detailed cognitive analysis of the experience in order to find a solution. Such detailed processing would result in more differentiated labels for mental and emotional experiences. By contrast, pleasant emotional experiences signal that the environment is safe and that there is no need for a detailed information analysis; as a consequence, they trigger more general cognitive processing.

Besides their characterization in terms of valence and arousal, the highly prototypical words in our study were closely related to at least one discrete emotion. In order to know which was the most prevalent discrete emotion in the dataset, we conducted a series of additional analyses, by computing the number and percentage of highly prototypical words highly related with one or more discrete emotion, the distribution of pure words (see Stadthagen-González et al., 2018), the dominant emotion, and the emotional exclusivity of each word (see Lynott & Connell, 2009). Overall, the results of these analyses revealed that the most dominant emotion in the highly prototypical set of words was sadness, although words related with sadness were not the most abundant among pure words or those with the highest emotion exclusivity values either (see Fig. 3). In fact, the greatest number of pure words and those with the highest emotional exclusivity values were the ones associated with happiness. Disgust deserves a special mention for being the discrete emotion less represented in terms of very related, pure, and dominant emotion words. Interestingly, when we focused on the ten words with the highest prototypicality ratings, the pattern was very similar to that in the entire subset of highly prototypical words, with a clear preponderance of unpleasant and highly arousing words which are very related to a discrete emotion (except disgust). Of note, there is great coincidence in the most prototypical emotion terms across studies and languages. Indeed, Frijda et al. (1995) and Van Goozen and Frijda (1993) conducted a free listing study with native speakers of six European countries in which the authors identified the 12 words with the highest prototypicality ratings. In all the languages, words related to happiness and sadness appeared in the top places of the list. Similarly, Alonso-Arbiol et al. (2006) reported that the top 12 words in their list were related with happiness, anger, sadness and fear (but not with disgust).

We also characterized the subset of highly prototypical words in terms of non-affective variables. We observed that highly prototypical words tended to be concrete and low frequency words, mostly acquired late in life. In terms of grammatical category, there were more adjectives than nouns or verbs among them. The preponderance of adjectives in this subset of words fits well with the effects of grammatical category on emotion prototypicality observed in the entire database. When we examined possible differences in prototypicality by grammatical category within the same family, we observed that adjectives are the best representatives of an emotion within the same word family, followed by nouns, and then verbs. These findings argue against the common practice of selecting only nouns to study the emotion lexicon (e.g., Alonso-Arbiol et al., 2006; Shaver et al., 1987, 2001).

The second aim of this work was to identify the affective and psycholinguistic variables that show the highest contribution to prototypicality. To that end, we computed Pearson correlations between the variables, and carried out both a factorial analysis and a regression analysis. The pattern of correlations indicates that the more prototypical the emotion words, the more affectively loaded (i.e., with higher scores in the emotionality variable), unpleasant, arousing, highly related with sadness, fear and anger, and acquired earlier during development they are. Considering only the affective variables, our results agree with those of Zammuner (1998) and Niedenthal et al. (2004), who also found a significant relation between prototypicality and valence (i.e., the equivalent of the emotionality variable in this study). The other affective variables investigated here were not explored in those previous studies, which reported the strongest correlation with intensity. This variable seems to be correlated with arousal, but clearly includes other emotion components (Frijda et al., 1992), as suggested by the lower correlation between prototypicality and arousal found here. With respect to the psycholinguistic variables, the results for concreteness do not support the proposal of Vigliocco and co-workers (e.g., Vigliocco et al., 2009), according to which abstract words would be more affectively loaded than concrete words. If that were the case, we should have found a negative correlation between prototypicality and concreteness (i.e., the most prototypical words should be the most abstract ones). Nevertheless, a possible reason of the lack of correlation here may be that the concreteness range was too small to allow for a significant correlation to emerge (in fact, the range is only four points for a 1-to-7 scale, and the standard deviation for concreteness is the lowest of the variables examined). Moreover, the present findings agree with the negative correlation between prototypicality and AoA reported by Niedenthal et al. (2004), but not with the positive correlation between prototypicality and lexical frequency shown by both Niedenthal et al. and Zammuner (1998). The lack of correlation with Zipf values is an unexpected result. One possible explanation could be found in the type of words examined here. The most obvious difference in this regard is the proportion of words belonging to distinct grammatical categories. Niedenthal et al. and Zammuner used mainly nouns and a very small number of adjectives. By contrast, in our study 42.8% of words are adjectives, 34.7% are nouns, and 22.2% are verbs. By examining our data, we observed that the sign of the correlations with prototypicality is the same along the three grammatical categories. The only exception is lexical frequency (Zipf), with r(547) = – .092, p = .032 for adjectives, r(445) .092, p = .053 for nouns, and r(286) = – .129, p = .032 for verbs (Bonferroni adjusted alpha level of .016 per test (.05/3)). That is, we found a positive but marginally significant correlation between prototypicality and Zipf only for nouns, which is in line with the Niedenthal et al. and Zammuner’s findings. This result leaves open the possibility that the inconsistent findings regarding the relationship between prototypicality and frequency are related to the grammatical category of the words included in each database.

The exploratory factor analysis was carried out to provide information about the inner structure underlying all the affective and psycholinguistic variables, and more specifically, to find out in which factor or factors the emotional prototypicality loads. The results showed that the variables were grouped into four theoretically interpretable components or factors. The first factor, which we could call emotional polarity, is made up of valence and all the discrete emotions. These variables somehow refer to the affective polarity, being either pleasant or unpleasant, of words. Valence is in fact a direct estimation of that, and the discrete emotion variables measure how representative are the words of the four unpleasant emotions (sadness, fear, anger and disgust) and of the only pleasant emotion (happiness) included in this database. Emotional prototypicality does not have a relevant load in this factor but in the second factor. We may conceptually tag the second factor as emotional salience because it is formed by emotionality, which is considered as a measure of the hedonic or affective load of words, and arousal, which denotes the degree of activation. Emotional prototypicality has a high load on this factor. The third factor can be described as lexical factor because it is composed by two well-known lexical variables, Zipf and AoA, which again appear associated to each other (e.g., Brysbaert, & Cortese, 2011), and also by emotionality to a lesser extent. Finally, both concreteness (to a larger extent) and AoA (to a lesser extent) load on the fourth factor. This is not the first time that AoA appears associated to word concreteness (e.g., Montefinese et al., 2019). Considering that AoA has been proposed to affect not only lexical forms but also the semantic representation of words (Brysbaert et al., 2000), this fourth factor could be labelled as semantic factor. Overall, the results from the factor analysis show four inner factors that theoretically match with the constructs of emotional polarity (pleasant and unpleasant emotions), hedonic salience or intensity (emotionality, arousal, and prototypicality), lexical features (mainly frequency and AoA, but also emotionality) and semantic features (mainly concreteness but also AoA).

The results of the regression analysis mostly agree with those of the correlations and the factor analysis. Eight variables (out of 11 introduced in the analysis) appeared as significant predictors of prototypicality. Those variables were emotionality, arousal, sadness, happiness, disgust, anger, AoA and Zipf. These results are similar to those obtained by Zammuner (1998) and Niedenthal et al. (2004). We need to be cautious, however, in the comparison between those previous studies and ours, because the number of factors considered in the regression analysis is not the same. In spite of that, what clearly emerges from the comparison of the three studies is that both emotionality (labelled as valence by both Zammuner and Niedenthal et al.) and arousal (similar to the intensity variable considered by those authors) are significant predictors of prototypicality. The contribution of discrete emotions to the prediction of prototypicality is a novel finding of this study, as it has not been investigated before. Importantly, both emotionality and arousal (but not valence) persisted as significant predictors when discrete emotions were included in the regression analysisFootnote 5. Thus, considering the factor and regression analyses together, emotional prototypicality is characterized by the hedonic load (emotionality) and the degree of activation (arousal) conveyed by words. Furthermore, the regression analyses also showed some influence of the best representative words of the four discrete emotions (happiness, sadness, anger and disgust). The positive beta-weights of happiness, sadness, and anger indicate that high scores in these emotions predict high emotional prototypicality values as well, and the negative beta-weight for disgust indicate the opposite, namely, that high scores in this emotion predict low prototypicality values. This last result agrees with the findings reported above, showing that disgust is the discrete emotion less represented in the high prototypical subset of words.

Finally, regarding the role of psycholinguistic variables, it should be noted that AoA persisted as a relevant predictor in the regression analysis even when lexical frequency was also included in the model. As a matter of fact, both variables contributed to a similar extent to prototypicality ratings. This suggests that the most prototypical words, regardless of their frequency, are acquired earlier than the less prototypical words. This result does not agree with that reported by Niedenthal et al. (2004), who concluded that the effects of AoA on prototypicality seemed to be mediated by subjective lexical frequency. In fact, Niedenthal et al. found that word frequency was a preponderant predictor of prototypicality, mediating not only the influence of lexical frequency but also that of other affective variables. In this respect, it is worth noting that Niedenthal et al. relied on the Zammuner’s (1998) word list to select their materials, and, importantly, they substituted some terms that were not very frequent in French by others used more often, and also added a number of frequent emotion words. This fact may have contributed to the strong association found by Niedenthal et al. between subjective word frequency and prototypicality. By contrast, our selection of materials was not based (at least not exclusively) on word frequency, as we set out to create a set of norms for potential emotion words that was as comprehensive as possible. This may explain the modest predicting capacity of frequency on emotion prototypicality in our study.

Limitations of the present work and future research

A potential limitation of this study concerns the generalization of the present norms to all Spanish speakers. Spanish is the native and official language of millions of speakers distributed across a large and diverse geographic and cultural area around the world. Although there is a common dictionary developed by The Real Academia Española (RAE, 2014) in close contact with other Latin American academies, there are linguistic and cultural differences across the Spanish-speaking populations and communities (see e.g., the Diccionario de Americanismos by ASALE, 2010; online version at https://lema.rae.es/damer/). Our norms were obtained from a sample of participants from different geographical regions of Spain, so the ratings are representative of the Spanish spoken in our country. For that reason, a direct application of the norms to other Spanish-speaking populations of Latin America may not guarantee a proper discrimination or selection of words by their emotional prototypicality.

Considering the above, it would be interesting to expand and compare the present norms with new emotional prototypicality ratings collected in other Spanish-speaking countries and cultures. Moreover, future studies should establish the relevance of prototypicality by examining its role on different psycholinguistic and cognitive tasks. For instance, it remains to be determined whether prototypicality affects word processing, as valence and arousal do. Finally, further studies should be developed to better characterize the representation of emotion words, in terms of non-affective variables, like physical/sensorial and cognitive/inner variables.

Conclusions

To conclude, we have presented the EmoPro norms, the first data set of prototypicality ratings for virtually the entire emotion lexicon in Spanish, and to date the largest such database for any language, with ratings for 1286 words. The highly prototypical emotion words are mostly unpleasant, highly arousing, late-acquired in life, concrete, and low-frequency words, with adjectives as the most representative grammatical category. Regarding the entire set of words, emotional prototypicality is determined by the hedonic load (emotionality) and the degree of activation (arousal) conveyed by words. With respect to discrete emotions, sadness and happiness are the more related with prototypicality, while disgust seems to be the less representative of a prototypical emotion. The present database will prove valuable for researchers in the field by providing them with a data-driven way of selecting the most representative emotion words for their studies.

Availability of the norms

The full set of norms can be found in a comma-separated format (.csv) and in spreadsheet format (.xls) as supplementary materials to this article. Both files contain identical data. The words are organized alphabetically (in Spanish) and the headings for the datasheet are as follows (Substitute [Variable] for Prototypicality, Valence, Arousal, Happiness, Disgust, Anger, Fear, Sadness, AoA, and Concreteness): Word = Word in Spanish; Few_Raters – an “X” on this column marks words with less than ten ratings for prototypicality; [Variable]_Mean = [Variable] mean value for all valid responses; [Variable]_SD = Standard Deviation of [Variable] ratings; [Variable]_NRatersFootnote 6 and [Variable]_%RatersFootnote 7 = number and percentage of participants (out of the maximum for the corresponding questionnaire), respectively, that knew and rated the word in [Variable]; [Variable]_Source = databases (named by the first author’s surname + publication year) used to obtain the mean rating, the standard deviation, and the number and percentage of raters, when available, for those words not rated in the present study; Zipf_EsPal = the log10(word frequency per million)+3, from EsPal-Written and Web Tokens database (Duchon et al., 2013); POS = part of speech (adjective, noun, or verb) of each word categorized by the authors; Emotionality = absolute value of Valence – 5; Family = lexical-semantic family of each word based on BDME TIP (Pena, 2019); Very_related_emotions = number of emotions to which the word is very related (with a rating > 3); Pure_word = 0 is not a pure word, and 1 is a pure word (with a rating > 3 to a particular emotion but < 3 to the other emotions); Dominant_emotion = the discrete emotion(s) in which the word has the highest rating; and finally, Emotional_exclusivity = the range of the scores across the five discrete emotions / (summed single scores in each discrete emotion – 5).