Multimodal Fusion Method Based on Self-Attention Mechanism
Wireless Communications and Mobile Computing (IF 2.146) | Pub Date: 2020-09-23 | DOI: 10.1155/2020/8843186
Hu Zhu, Ze Wang, Yu Shi, Yingying Hua, Guoxia Xu, Lizhen Deng

Multimodal fusion is one of the popular directions of multimodal research and an emerging field of artificial intelligence. It aims to exploit the complementarity of heterogeneous data and to provide reliable classification for the model. Multimodal data fusion transforms data from multiple single-modality representations into a compact multimodal representation. Most previous studies in this field have used tensor-based multimodal representations; however, as the input is converted into a tensor, the dimensionality and computational complexity grow exponentially. In this paper, we propose a low-rank tensor multimodal fusion method with an attention mechanism, which improves efficiency and reduces computational complexity. We evaluate our model on three multimodal fusion tasks based on the public datasets CMU-MOSI, IEMOCAP, and POM. Our model achieves good performance while flexibly capturing global and local connections. Compared with other tensor-based multimodal fusion methods, experiments show that our model consistently achieves better results under a series of attention mechanisms.
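To make the low-rank tensor fusion idea concrete, here is a minimal NumPy sketch. It is an illustrative assumption based on the general low-rank fusion technique the abstract references, not the authors' implementation: each modality vector is padded with a constant 1 (so unimodal and bimodal terms survive the product), projected through a rank-decomposed factor, and the per-modality projections are combined by elementwise product, avoiding the exponentially large outer-product tensor. The function name `low_rank_fusion` and all shapes are hypothetical.

```python
import numpy as np

def low_rank_fusion(modalities, factors):
    """Low-rank multimodal fusion sketch.

    modalities: list of 1-D feature vectors, one per modality (lengths d_m)
    factors:    list of arrays; factors[m] has shape (rank, d_m + 1, out_dim),
                the rank-decomposed projection for modality m
    Returns a fused vector of shape (out_dim,).
    """
    fused = None
    for z, W in zip(modalities, factors):
        # Append a constant 1 so lower-order interaction terms are retained.
        z1 = np.concatenate([z, [1.0]])
        # Sum over the rank dimension of z1 @ W[r]: shape (out_dim,).
        proj = np.einsum('j,rjk->k', z1, W)
        # Elementwise product across modalities replaces the full
        # outer-product tensor, keeping cost linear in the number of modalities.
        fused = proj if fused is None else fused * proj
    return fused

# Toy usage with two modalities (e.g., text and audio features).
rng = np.random.default_rng(0)
text_feat, audio_feat = rng.standard_normal(4), rng.standard_normal(3)
W_text = rng.standard_normal((2, 5, 6))   # rank=2, d=4+1, out_dim=6
W_audio = rng.standard_normal((2, 4, 6))  # rank=2, d=3+1, out_dim=6
h = low_rank_fusion([text_feat, audio_feat], [W_text, W_audio])
```

The attention mechanism described in the paper would additionally reweight features before or after this fusion step; that part is omitted here for brevity.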
