Learning to Learn Better Unimodal Representations via Adaptive Multimodal Meta-Learning
IEEE Transactions on Affective Computing (IF 11.2), Pub Date: 2022-05-27, DOI: 10.1109/taffc.2022.3178231
Ya Sun, Sijie Mai, Haifeng Hu

Multimodal sentiment analysis is an emerging field of artificial intelligence. The predominant approaches have made notable progress by designing sophisticated fusion architectures that explore inter-modal interactions. However, these works tend to apply a uniform optimization strategy to every modality, so that only sub-optimal unimodal representations are obtained for multimodal fusion. To address this issue, we propose a novel meta-learning based paradigm that retains the advantages of unimodal learning while further boosting the performance of multimodal fusion. Specifically, we introduce Adaptive Multimodal Meta-Learning (AMML) to meta-learn the unimodal networks and adapt them for multimodal inference. AMML can (1) obtain better-optimized unimodal representations via meta-training on unimodal tasks, adaptively adjusting the learning rate and assigning a modality-specific optimization procedure to each modality; and (2) adapt the optimized unimodal representations for multimodal fusion via meta-testing on multimodal tasks. Because multimodal fusion often suffers from distributional mismatches between the features of different modalities, owing to the heterogeneous nature of the signals, we apply a distribution transformation layer to the unimodal representations to regularize their distributions. In this way, the distribution gaps are reduced and fusion becomes more effective. Extensive experiments on two widely used datasets demonstrate that AMML achieves state-of-the-art performance.
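To make the meta-train/meta-test procedure above concrete, here is a minimal sketch of an AMML-style loop in PyTorch (using torch.func.functional_call from PyTorch 2.x). Everything in it is an illustrative assumption rather than the authors' released implementation: the toy UnimodalEncoder, the LayerNorm used as a stand-in for the distribution transformation layer, the learnable per-modality inner learning rates, and all dimensions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

MODALITIES = ["text", "audio", "vision"]
FEAT_DIM, HID_DIM = 16, 32  # arbitrary toy dimensions

class UnimodalEncoder(nn.Module):
    """Toy per-modality encoder with a unimodal sentiment head."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(FEAT_DIM, HID_DIM), nn.ReLU(),
                                  nn.Linear(HID_DIM, HID_DIM))
        self.head = nn.Linear(HID_DIM, 1)

    def forward(self, x):
        z = self.body(x)
        return z, self.head(z).squeeze(-1)

encoders = nn.ModuleDict({m: UnimodalEncoder() for m in MODALITIES})
# Stand-in for the distribution transformation layer: normalize each
# unimodal representation to reduce cross-modal distribution gaps.
dist_transform = nn.ModuleDict({m: nn.LayerNorm(HID_DIM) for m in MODALITIES})
fusion = nn.Linear(HID_DIM * len(MODALITIES), 1)

# Learnable per-modality inner-loop learning rates: the "adaptively
# adjusts the learning rate for each modality" part of the abstract.
inner_lr = nn.ParameterDict({m: nn.Parameter(torch.tensor(1e-2))
                             for m in MODALITIES})

meta_opt = torch.optim.Adam(
    [*encoders.parameters(), *dist_transform.parameters(),
     *fusion.parameters(), *inner_lr.parameters()], lr=1e-3)

def meta_step(inputs, labels):
    """One meta-iteration. inputs: {modality: (batch, FEAT_DIM)} tensors,
    labels: (batch,) sentiment scores."""
    fused_feats = []
    for m in MODALITIES:
        params = dict(encoders[m].named_parameters())
        # Meta-train: one gradient step on the unimodal task, taken with
        # this modality's own learnable learning rate.
        _, pred = functional_call(encoders[m], params, (inputs[m],))
        uni_loss = F.mse_loss(pred, labels)
        grads = torch.autograd.grad(uni_loss, tuple(params.values()),
                                    create_graph=True)  # second-order path
        fast = {k: p - inner_lr[m] * g
                for (k, p), g in zip(params.items(), grads)}
        # Meta-test: re-encode with the adapted weights, then regularize
        # the representation's distribution before fusion.
        z_adapt, _ = functional_call(encoders[m], fast, (inputs[m],))
        fused_feats.append(dist_transform[m](z_adapt))
    multi_loss = F.mse_loss(
        fusion(torch.cat(fused_feats, dim=-1)).squeeze(-1), labels)
    meta_opt.zero_grad()
    multi_loss.backward()  # gradients flow into encoders and the lrs
    meta_opt.step()
    return multi_loss.item()

# Toy usage with random data:
batch = {m: torch.randn(8, FEAT_DIM) for m in MODALITIES}
print(meta_step(batch, torch.randn(8)))
```

The key design point the sketch illustrates is that the outer (multimodal) loss is backpropagated through the inner (unimodal) update, so the per-modality learning rates are themselves meta-learned rather than hand-tuned.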
