Modality-Transferable Emotion Embeddings for Low-Resource Multimodal Emotion Recognition
arXiv - CS - Computation and Language. Pub Date: 2020-09-21, arXiv: 2009.09629
Wenliang Dai, Zihan Liu, Tiezheng Yu and Pascale Fung

Despite the recent achievements made in the multimodal emotion recognition task, two problems still exist and have not been well investigated: 1) the relationships between different emotion categories are not utilized, which leads to sub-optimal performance; and 2) current models fail to cope well with low-resource emotions, especially unseen emotions. In this paper, we propose a modality-transferable model with emotion embeddings to tackle the aforementioned issues. We use pre-trained word embeddings to represent emotion categories for textual data. Then, two mapping functions are learned to transfer these embeddings into the visual and acoustic spaces. For each modality, the model calculates the representation distance between the input sequence and the target emotions and makes predictions based on these distances. By doing so, our model can directly adapt to unseen emotions in any modality, since we have their pre-trained embeddings and modality mapping functions. Experiments show that our model achieves state-of-the-art performance on most of the emotion categories. In addition, our model also outperforms existing baselines in the zero-shot and few-shot scenarios for unseen emotions.
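As a rough illustration of the distance-based prediction the abstract describes, the sketch below keeps pre-trained emotion word embeddings frozen, learns two linear layers as the visual and acoustic mapping functions, and scores each modality by cosine similarity to the mapped emotion anchors. The class name, feature dimensions, and fusion by averaging are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal PyTorch sketch of distance-based emotion prediction with
# modality-transferable emotion embeddings. Dimensions, the linear mapping
# functions, and cosine similarity as the distance are assumptions for
# illustration, not the paper's exact architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityTransferableClassifier(nn.Module):
    def __init__(self, emo_emb: torch.Tensor, text_dim=300, vis_dim=35, aco_dim=74):
        super().__init__()
        # Pre-trained word embeddings for each emotion label (num_emotions x text_dim),
        # kept frozen so new (unseen) emotions can be appended at test time.
        self.register_buffer("emo_emb", emo_emb)
        # Learned mapping functions from the textual embedding space
        # into the visual and acoustic spaces.
        self.to_visual = nn.Linear(text_dim, vis_dim)
        self.to_acoustic = nn.Linear(text_dim, aco_dim)

    def forward(self, text_repr, visual_repr, acoustic_repr):
        # Each *_repr is a (batch, dim) utterance-level representation produced
        # by some sequence encoder (omitted here for brevity).
        emo_t = self.emo_emb                    # textual emotion anchors
        emo_v = self.to_visual(self.emo_emb)    # anchors mapped to the visual space
        emo_a = self.to_acoustic(self.emo_emb)  # anchors mapped to the acoustic space

        # Cosine similarity between each input and every emotion anchor,
        # giving one (batch, num_emotions) score matrix per modality.
        sim_t = F.cosine_similarity(text_repr.unsqueeze(1), emo_t.unsqueeze(0), dim=-1)
        sim_v = F.cosine_similarity(visual_repr.unsqueeze(1), emo_v.unsqueeze(0), dim=-1)
        sim_a = F.cosine_similarity(acoustic_repr.unsqueeze(1), emo_a.unsqueeze(0), dim=-1)

        # Fuse modalities by averaging; the predicted emotion is the closest
        # (most similar) anchor.
        return (sim_t + sim_v + sim_a) / 3

# Usage sketch: scores for unseen emotions can be obtained by appending their
# pre-trained word embeddings to emo_emb; the mapping layers transfer them to
# the visual and acoustic spaces without retraining.
glove = torch.randn(6, 300)   # stand-in for pre-trained emotion word vectors
model = ModalityTransferableClassifier(glove)
scores = model(torch.randn(4, 300), torch.randn(4, 35), torch.randn(4, 74))
pred = scores.argmax(dim=-1)  # nearest emotion per sample
```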

Updated: 2020-10-08