A Deep Multi-Level Attentive network for Multimodal Sentiment Analysis
arXiv - CS - Multimedia. Pub Date: 2020-12-15. DOI: arXiv-2012.08256. Ashima Yadav, Dinesh Kumar Vishwakarma
Multimodal sentiment analysis has attracted increasing attention and has broad
application prospects. Existing methods focus on a single modality, which fails
to capture social media content that spans multiple modalities. Moreover, in
multimodal learning, most works simply combine the two modalities without
exploring the complicated correlations between them, which results in
unsatisfactory performance for multimodal sentiment classification. Motivated
by this, we propose a Deep Multi-Level Attentive network that exploits the
correlation between the image and text modalities to improve multimodal
learning. Specifically, we generate a bi-attentive visual map along the spatial
and channel dimensions to magnify the CNN's representation power. We then model
the correlation between image regions and word semantics by applying semantic
attention to extract the textual features related to the bi-attentive visual
features. Finally, self-attention is employed to automatically fetch the
sentiment-rich multimodal features for classification. Extensive evaluations on
four real-world datasets, namely MVSA-Single, MVSA-Multiple, Flickr, and Getty
Images, verify the superiority of our method.
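The pipeline the abstract describes — channel and spatial bi-attention over a CNN feature map, then semantic attention tying word features to the attended visual feature — can be sketched roughly as follows. This is a minimal NumPy illustration under assumed shapes and gating choices (softmax over pooled statistics); it is not the paper's actual architecture or layer definitions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bi_attentive_map(feat):
    """Apply channel then spatial attention to a CNN feature map.

    feat: (C, H, W) array. Gating via softmax over pooled statistics
    is an illustrative assumption, not the paper's exact mechanism.
    """
    C, H, W = feat.shape
    # Channel attention: weight each channel by its global-average response.
    chan_w = softmax(feat.mean(axis=(1, 2)))                         # (C,)
    feat_c = feat * chan_w[:, None, None]
    # Spatial attention: weight each location by its channel-mean response.
    spat_w = softmax(feat_c.mean(axis=0).reshape(-1)).reshape(H, W)  # (H, W)
    return feat_c * spat_w[None, :, :]

def semantic_attention(visual, words):
    """Weight word vectors by similarity to the pooled visual vector.

    visual: (D,) pooled bi-attentive visual feature; words: (T, D).
    Returns a single (D,) visually grounded text feature.
    """
    scores = softmax(words @ visual)                                 # (T,)
    return scores @ words                                            # (D,)

# Toy shapes only: 4 channels, a 3x3 map, 5 word vectors of dim 4.
rng = np.random.default_rng(0)
fmap = bi_attentive_map(rng.normal(size=(4, 3, 3)))
text = semantic_attention(fmap.mean(axis=(1, 2)), rng.normal(size=(5, 4)))
```

In the paper's full model, the attended visual and textual features would then be fused and passed through self-attention before classification; the sketch stops at the two attention stages the abstract spells out.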
Updated: 2020-12-16