Joint emotion label space modeling for affect lexica

https://doi.org/10.1016/j.csl.2021.101257

Abstract

Emotion lexica are commonly used resources to combat data poverty in automatic emotion detection. However, vocabulary coverage issues, differences in construction method and discrepancies in emotion framework and representation result in a heterogeneous landscape of emotion detection resources, calling for a unified approach to utilizing them. To combat this, we present an extended emotion lexicon of 30,273 unique entries, which is a result of merging eight existing emotion lexica by means of a multi-view variational autoencoder (VAE). We showed that a VAE is a valid approach for combining lexica with different label spaces into a joint emotion label space with a chosen number of dimensions, and that these dimensions are still interpretable. We tested the utility of the unified VAE lexicon by employing the lexicon values as features in an emotion detection model. We found that the VAE lexicon outperformed individual lexica, but contrary to our expectations, it did not outperform a naive concatenation of lexica, although it did contribute to the naive concatenation when added as an extra lexicon. Furthermore, using lexicon information as additional features on top of state-of-the-art language models usually resulted in a better performance than when no lexicon information was used.

Introduction

Affect lexica are valuable resources in the fields of experimental psychology and natural language processing (NLP). An affect lexicon is a list of words or a database that contains lexical entries with their associated affective value. This value can be conveyed as a polarity association (negative–neutral–positive), in which case we call it a sentiment lexicon, or, following emotion frameworks provided in the field of psychology, it can be denoted as a value associated with an emotion category or emotional dimension. In this case, we use the term emotion lexicon.

In the field of psychology, affect lexica are mostly known as affective norms. These norms can be used as stimuli in emotion research or for designing experiments on word memory and processing (Warriner et al., 2013). Emotion lexica are also in high demand in NLP, because they can be used to combat data poverty in automatic emotion detection. The lexica can be employed as a straightforward way to automatically label texts with emotional information, or they can be used as features in supervised machine learning approaches (Ma et al., 2018). Even in state-of-the-art systems for emotion detection (e.g., the winning teams of the SemEval-2018 shared task on multi-label emotion classification), word embeddings in Bi-LSTM architectures are complemented with features from affect lexica (Baziotis et al., 2018; Meisheri and Dey, 2018).

However, methodological issues emerge when employing lexica for emotion detection. Firstly, lexica often cover only a small portion of a dataset’s vocabulary. Secondly, the way they are constructed can vary widely: from lab conditions in the field of psychology (Bradley and Lang, 1999), through crowdsourced annotations (Mohammad and Turney, 2013), to distant supervision (Mohammad and Kiritchenko, 2015). All these construction methods introduce a certain amount of noise, either because of divergence in emotion assessment within or between annotators, or because of imperfections in automatic lexicon creation. Finally, there is currently no consensus on a standard emotion framework: categorical and dimensional frameworks coexist, and theorists propose many different sets of categorical labels (Ekman, 1992; Plutchik, 1980) or dimensional axes (Mehrabian and Russell, 1974; Fontaine et al., 2007). This variety is also reflected in the existing emotion lexica, which show a myriad of different categorical labels or numerical scales. The inconsistency in labels impedes the exchange of data and knowledge resources and calls for a unified emotion lexicon with a high word coverage that consolidates the annotations of the different emotion frameworks.

Although the problem of vocabulary coverage could be tackled by naively concatenating existing emotion lexica and thus having more entries, this approach does not address the problem of disparate label spaces, nor does it deal with the noise introduced during the construction of the lexica. Naively concatenating different lexica results in conflicting information, which makes the lexicon unsuitable for either research in psychology or keyword-based emotion detection, and could hamper learning in supervised machine learning. By contrast, previous research showed that merging sentiment lexica with a multi-view variational autoencoder (VAE) outperforms a naive concatenation technique when the lexicon values are used as features in a supervised learning approach for sentiment analysis (Hoyle et al., 2019). The intuition behind using a VAE to combine lexica is that the VAE maps the lexica into a shared latent space, making the information in the different lexica less heterogeneous. Moreover, variational autoencoders are commonly used for their noise filtering ability (Aggarwal et al., 2018) and can thus remove the noise introduced in the lexicon construction process.
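To make the limitation of naive concatenation concrete, it can be sketched as follows. This is a minimal illustration and not the procedure used in the paper: the two toy lexica, their label names and all scores are invented for the example, and `concatenate_lexica` is a hypothetical helper.

```python
from typing import Dict, List, Optional, Tuple

# Toy lexica with disparate label spaces (all words and scores are invented).
categorical = {  # hypothetical categorical lexicon: binary emotion associations
    "angry": {"anger": 1.0, "joy": 0.0},
    "happy": {"anger": 0.0, "joy": 1.0},
}
dimensional = {  # hypothetical dimensional lexicon: scores in [0, 1]
    "happy": {"valence": 0.9, "arousal": 0.6},
    "calm": {"valence": 0.7, "arousal": 0.1},
}

Lexicon = Dict[str, Dict[str, float]]


def concatenate_lexica(
    lexica: Dict[str, Lexicon],
) -> Tuple[List[str], Dict[str, List[Optional[float]]]]:
    """Union the vocabularies and place each lexicon's labels side by side.

    A word that is missing from a lexicon gets None for that lexicon's labels.
    """
    columns: List[str] = []
    for name, lexicon in lexica.items():
        labels = sorted({label for scores in lexicon.values() for label in scores})
        columns.extend(f"{name}:{label}" for label in labels)

    vocabulary = sorted({word for lexicon in lexica.values() for word in lexicon})
    table: Dict[str, List[Optional[float]]] = {}
    for word in vocabulary:
        row: List[Optional[float]] = []
        for column in columns:
            name, label = column.split(":", 1)
            row.append(lexica[name].get(word, {}).get(label))
        table[word] = row
    return columns, table


columns, table = concatenate_lexica({"cat": categorical, "dim": dimensional})
for word, row in table.items():
    print(word, dict(zip(columns, row)))
```

In this toy case the vocabulary grows from two to three words, but the label columns of the two lexica stay separate and only "happy" has a value in every column. Nothing in the concatenation relates "joy" to "valence" or smooths out noisy entries, which is the gap a shared latent space is meant to fill.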

We believe there is an additional advantage in using a VAE for combining emotion lexica, namely that the dimension of the VAE’s latent space can be chosen. While a dimension of three is preferred for sentiment (corresponding to positivity, neutrality and negativity), multiple sizes are possible when dealing with emotions (corresponding to different emotion frameworks). If the latent dimensions indeed turn out to correspond to interpretable emotion dimensions, this opens possibilities for creating large lexica tailored to specific emotion frameworks that are also usable in psychology and in straightforward keyword-based emotion labeling.

However, in comparison to sentiment lexica, joining emotion lexica is more complex: where sentiment lexica contain information about the polarity of words (negative–neutral–positive), emotion lexica contain more fine-grained affective states. This complexity is also reflected in the dimensionality of the lexica and the corresponding emission distributions used in the VAE: where most sentiment lexica are unidimensional, all emotion lexica used in this study are multidimensional. Moreover, emotion lexica quantify different concepts or dimensions, e.g., anger, sadness, disgust and dominance. This contrasts with sentiment lexica, where the only concept is polarity. The question is thus whether we can still find a meaningful latent space into which the emotion lexica can be mapped.
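For orientation, a generic multi-view VAE objective can be written as below. This is a textbook-style sketch and not the paper's exact formulation, which is given in its Section 3.1 and is not reproduced here. The notation is assumed for this illustration: $z_w$ is the latent vector of word $w$, $x_w^{(k)}$ is the (possibly multidimensional) annotation of $w$ in lexicon $k$, $\mathcal{K}_w$ is the set of lexica that contain $w$, $p_k$ is the emission distribution of lexicon $k$, and $q$ is the approximate posterior. The views are assumed to be conditionally independent given $z_w$.

```latex
\log p\bigl(\{x_w^{(k)}\}_{k \in \mathcal{K}_w}\bigr)
\;\ge\;
\mathbb{E}_{q(z_w \mid x_w)}\Bigl[\,\sum_{k \in \mathcal{K}_w} \log p_k\bigl(x_w^{(k)} \mid z_w\bigr)\Bigr]
\;-\;
\mathrm{KL}\bigl(q(z_w \mid x_w)\,\|\,p(z_w)\bigr)
```

Under these assumptions, each lexicon keeps its own emission distribution $p_k$, which is where the different label spaces and dimensionalities are accommodated, while all lexica share the same latent vector $z_w$ for a given word.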

In this paper, we examine the use of a multi-view variational autoencoder to combine eight existing English emotion lexica into a larger, joint emotion lexicon and find indications that the chosen dimensions of the latent space can be correlated with emotional dimensions present in the source lexica. We then evaluate the joint lexicon on the downstream task of emotion detection on thirteen emotion datasets, by using the lexicon values as features in a logistic/linear regression classifier. We also combine the lexicon features with word embeddings in a Bi-LSTM architecture and show that adding lexicon features improves the performance of plain word embedding models. Contrary to our expectations, the VAE lexicon does not outperform a naive concatenation of lexica, although it does outperform all individual lexica on the task of emotion detection.
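The idea of using lexicon values as features can be sketched as follows. The aggregation shown here (averaging the lexicon vectors of the covered tokens) is an assumption made for illustration; the paper's actual feature extraction is described in its Section 4.1, which is not reproduced here. The function name `lexicon_features` and the toy lexicon values are hypothetical.

```python
from typing import Dict, List


def lexicon_features(
    tokens: List[str], lexicon: Dict[str, List[float]], dim: int
) -> List[float]:
    """Average the lexicon vectors of the tokens that occur in the lexicon.

    Returns a zero vector when no token is covered by the lexicon.
    """
    hits = [lexicon[token] for token in tokens if token in lexicon]
    if not hits:
        return [0.0] * dim
    return [sum(vector[i] for vector in hits) / len(hits) for i in range(dim)]


# Toy two-dimensional lexicon (invented values).
toy_lexicon = {"happy": [0.75, 0.25], "sad": [0.0, 0.5]}

features = lexicon_features("i am happy not sad".split(), toy_lexicon, dim=2)
uncovered = lexicon_features("nothing matches here".split(), toy_lexicon, dim=2)
print(features, uncovered)
```

A fixed-length vector of this kind could then be passed to a logistic or linear regression model, or appended to another text representation. The zero vector for uncovered texts shows why vocabulary coverage matters: a text without any lexicon hits contributes no lexicon information at all.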

Contributions: This paper contributes to the field of emotion analysis in NLP by (a) presenting a unified emotion lexicon of 30,273 unique entries, automatically combined by a multi-view variational autoencoder, and showing that this 8-dimensional lexicon is still interpretable; (b) bringing together a large set of existing emotion detection resources and thus learning more about the relationships between them; and (c) exploring the use of existing lexica and the joint VAE lexicon for the task of emotion detection.

Section 2 describes background on emotion frameworks and related studies that utilize lexica for emotion detection, or that combine lexica and datasets with disparate label spaces. In Section 3, we describe the VAE model (Section 3.1) and show how to interpret the dimensions of the resulting joint emotion label space (Section 3.2). In Section 4, we describe our methodology to evaluate the VAE lexicon on the downstream task of emotion detection (Section 4.1) and report the results (Section 4.2), which we further discuss in Section 5. We end this paper with a conclusion in Section 6.

Section snippets

Related work

In this section, we will focus on briefly discussing the different frameworks in emotion theory (Section 2.1), illustrating the use of lexica for emotion detection (Section 2.2) and describing related studies dealing with different emotion frameworks in NLP (Section 2.3).

Method

A fair number of emotion lexica are already available for English. However, they all have their own specifics, assets and shortcomings (e.g., regarding emotion framework and vocabulary coverage). Table 2 shows an overview of eight emotion lexica with information about their labels and size. More extensive descriptions can be found in Appendix A.

For maximum vocabulary coverage, it is appropriate to combine multiple lexica when using lexicon information in an emotion detection task. However,

Method

Datasets. We evaluate the VAE joint emotion lexicon by using it as features on the downstream task of emotion detection and compare it with the use of the individual lexica and a naive concatenation thereof. The evaluation is done on eleven commonly used emotion datasets (Blogs, Emotion in Text, ElectoralTweets, ISEAR, Tales, TEC, Affect in Tweets, SSEC, Affective Text, EmoBank and Facebook-VA) and two additional datasets also suited for emotion detection (DailyDialog and Emotion-Stimulus).

Discussion

In this section, we discuss some insights about the use of lexica provided by the emotion detection experiments performed in Section 4. We zoom in on factors that have a potential impact on the performance of lexica, namely lexicon size, construction method and quality, label set and dimensionality, trainability of the input representation and lexicon combination strategy.

Conclusion

This paper addressed the problem of disparate label spaces in emotion lexica and presented an extended, unified lexicon containing 30,273 unique entries. The lexicon was obtained by merging eight existing emotion lexica with a multi-view variational autoencoder. We showed that we can choose the dimension of the VAE latent space so that it is still interpretable, corresponding to emotional dimensions present in the source lexica.

We evaluated the VAE lexicon by using it as features in the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was carried out with funding from the Research Foundation - Flanders, Belgium under a Strategic Basic Research fellowship and supported with a travel grant from Research Foundation - Flanders.

References (63)

  • Atmaja, B.T. Deep learning-based categorical and dimensional emotion recognition for written and spoken text (2019)
  • Baziotis, C., et al. NTUA-SLP at SemEval-2018 task 1: Predicting affective content in tweets with deep attentive RNNs and transfer learning
  • Bostan, L.A.M., et al. An analysis of annotated corpora for emotion classification in text
  • Bradley, M.M., et al. Affective Norms for English Words (ANEW): Instruction Manual and Affective Ratings. Technical Report (1999)
  • Buechel, S., et al. A flexible mapping scheme for discrete and dimensional emotion representations
  • Buechel, S., Hahn, U., 2017b. EMOBANK: Studying the impact of annotation perspective and representation format on...
  • Buechel, S., et al. Emotion representation mapping for automatic lexicon construction (mostly) performs on human level
  • Buechel, S., et al. Learning and evaluating emotion lexicons for 91 languages
  • Buechel, S., et al. Learning neural emotion analysis from 100 observations: The surprising effectiveness of pre-trained word representations (2018)
  • Cambria, E., et al. A Practical Guide to Sentiment Analysis (2017)
  • Chaffar, S., et al. Using a heterogeneous dataset for emotion analysis in text
  • Chaumartin, F.-R. UPAR7: A knowledge-based system for headline sentiment tagging
  • Devlin, J., et al. BERT: Pre-training of deep bidirectional transformers for language understanding
  • Ekman, P. An argument for basic emotions. Cogn. Emot. (1992)
  • Emerson, G., et al. Sentimerge: Combining sentiment lexicons in a Bayesian framework
  • Esuli, A., et al. Sentiwordnet: A publicly available lexical resource for opinion mining
  • Fontaine, J.R., et al. The world of emotions is not two-dimensional. Psychol. Sci. (2007)
  • Ghazi, D., et al. Detecting emotion stimuli in emotion-bearing sentences
  • Giulianelli, M., et al. Semi-supervised emotion lexicon expansion with label propagation. Comput. Linguist. Neth. J. (2018)
  • Hoyle, A.M., et al. Combining sentiment lexica with a multi-view variational autoencoder
  • Ide, N., et al. The manually annotated sub-corpus: A community resource for and by the people
