Adaptive Deep Metric Learning for Affective Image Retrieval and Classification
IEEE Transactions on Multimedia (IF 7.3), Pub Date: 2020-01-01, DOI: 10.1109/tmm.2020.3001527
Xingxu Yao, Dongyu She, Haiwei Zhang, Jufeng Yang, Ming-Ming Cheng, Liang Wang

An image is worth a thousand words. As more users express emotions through images and videos online, many researchers have conducted extensive studies on understanding visual emotions. However, most existing methods based on convolutional neural networks retrieve and classify affective images in a discrete label space, ignoring both the hierarchical and complex nature of emotions. On one hand, unlike concrete and isolated object concepts (e.g., cat and dog), emotions exhibit a hierarchical relationship among themselves. On the other hand, most widely used deep methods depend on representations from fully connected layers, which lack the essential texture information for recognizing emotions. In this work, we address these problems via adaptive deep metric learning. Specifically, we design an adaptive sentiment similarity loss that embeds affective images with respect to emotion polarity and adaptively adjusts the margin between different image pairs. We further exploit the sentiment vector as an effective representation for distinguishing affective images, utilizing texture information derived from multiple convolutional layers. Finally, we develop a unified multi-task deep framework that simultaneously optimizes both retrieval and classification objectives. Extensive evaluations on four benchmark datasets demonstrate that the proposed framework performs favorably against state-of-the-art methods.
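The abstract does not give the exact form of the adaptive sentiment similarity loss, but the idea it describes (a metric-learning loss whose margin depends on whether two images share emotion polarity) can be sketched as follows. This is a minimal illustrative contrastive-style loss, not the paper's implementation; the polarity mapping, margin values, and function names are all assumptions made for the example.

```python
import math

# Hypothetical emotion-to-polarity mapping (illustrative, not from the paper):
# positive emotions map to +1, negative emotions to -1.
POLARITY = {"amusement": 1, "contentment": 1, "anger": -1, "sadness": -1}

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def adaptive_contrastive_loss(emb1, emb2, label1, label2,
                              base_margin=1.0, polarity_scale=0.5):
    """Contrastive loss with a polarity-adaptive margin.

    Same-emotion pairs are pulled together; different-emotion pairs are
    pushed apart, with a wider margin when the two emotions also differ
    in polarity (e.g., amusement vs. anger) than when they share polarity
    (e.g., amusement vs. contentment).
    """
    d = euclidean(emb1, emb2)
    if label1 == label2:
        return d ** 2  # pull same-emotion pairs together
    # Adaptive margin: enlarged for cross-polarity pairs.
    same_polarity = POLARITY[label1] == POLARITY[label2]
    margin = base_margin + (0.0 if same_polarity else polarity_scale)
    return max(0.0, margin - d) ** 2
```

At equal embedding distance, a cross-polarity pair (amusement vs. anger) incurs a larger penalty than a same-polarity pair (amusement vs. contentment), which reflects the hierarchical structure the abstract emphasizes: polarity first, fine-grained emotion second.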

Updated: 2020-01-01