A Deep Multi-Level Attentive network for Multimodal Sentiment Analysis
arXiv - CS - Multimedia. Pub Date: 2020-12-15. DOI: arXiv-2012.08256. Ashima Yadav, Dinesh Kumar Vishwakarma
Multimodal sentiment analysis has attracted increasing attention and has broad
application prospects. Existing methods focus on a single modality, which fails
to capture social media content that spans multiple modalities. Moreover, in
multimodal learning, most works simply combine the two modalities without
exploring the complicated correlations between them, which results in
unsatisfactory performance for multimodal sentiment classification. Motivated
by this, we propose a Deep Multi-Level Attentive network that exploits the
correlation between the image and text modalities to improve multimodal
learning. Specifically, we generate a bi-attentive visual map along the spatial
and channel dimensions to magnify the CNN's representation power. We then model
the correlation between image regions and word semantics by applying semantic
attention to extract the textual features related to the bi-attentive visual
features. Finally, self-attention is employed to automatically fetch the
sentiment-rich multimodal features for classification. Extensive evaluations on
four real-world datasets, namely MVSA-Single, MVSA-Multiple, Flickr, and Getty
Images, verify the superiority of our method.
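The pipeline the abstract describes — channel and spatial bi-attention over a CNN feature map, then semantic attention tying word features to the attended visual feature — can be sketched roughly as follows. This is a minimal NumPy illustration under assumed shapes and gating choices (softmax over pooled statistics); it is not the paper's actual architecture or layer definitions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def bi_attentive_map(feat):
    """Apply channel then spatial attention to a CNN feature map.

    feat: (C, H, W) array. Gating via softmax over pooled statistics
    is an illustrative assumption, not the paper's exact mechanism.
    """
    C, H, W = feat.shape
    # Channel attention: weight each channel by its global-average response.
    chan_w = softmax(feat.mean(axis=(1, 2)))                         # (C,)
    feat_c = feat * chan_w[:, None, None]
    # Spatial attention: weight each location by its channel-mean response.
    spat_w = softmax(feat_c.mean(axis=0).reshape(-1)).reshape(H, W)  # (H, W)
    return feat_c * spat_w[None, :, :]

def semantic_attention(visual, words):
    """Weight word vectors by similarity to the pooled visual vector.

    visual: (D,) pooled bi-attentive visual feature; words: (T, D).
    Returns a single (D,) visually grounded text feature.
    """
    scores = softmax(words @ visual)                                 # (T,)
    return scores @ words                                            # (D,)

# Toy shapes only: 4 channels, a 3x3 map, 5 word vectors of dim 4.
rng = np.random.default_rng(0)
fmap = bi_attentive_map(rng.normal(size=(4, 3, 3)))
text = semantic_attention(fmap.mean(axis=(1, 2)), rng.normal(size=(5, 4)))
```

In the paper's full model, the attended visual and textual features would then be fused and passed through self-attention before classification; the sketch stops at the two attention stages the abstract spells out.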
Updated: 2020-12-16