Jointly Image Annotation and Classification Based on Supervised Multi-Modal Hierarchical Semantic Model
Pattern Recognition and Image Analysis (IF 0.7), Pub Date: 2020-03-31, DOI: 10.1134/s1054661820010058
Chun-yan Yin , Yong-Heng Chen , Wan-li Zuo

Abstract

Many applications involve capturing correlations from multi-modal data, where the available information spans several modalities such as text, images, or speech. In this paper, we focus on the specific case in which images are both labeled with a category and annotated with free text, and develop a supervised multi-modal hierarchical semantic model (smHSM) that incorporates image classification into the joint modeling of visual and textual information, for the tasks of image annotation and classification. To evaluate the effectiveness of our model, we run experiments on two datasets and compare against traditional models. The results demonstrate the effectiveness and advantages of our model in terms of caption perplexity, classification accuracy, and image annotation accuracy.
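To make the joint-task setup concrete, below is a minimal sketch of a model that shares one image representation between an annotation head and a classification head and trains them with a combined loss. This is only an illustration of the general idea described in the abstract, not the paper's smHSM formulation (a hierarchical semantic model); the architecture, class names, dimensions, and loss weighting are all assumptions.

```python
# Illustrative sketch only: generic joint image annotation + classification.
# NOT the paper's smHSM; all names and dimensions below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointAnnotateClassify(nn.Module):
    def __init__(self, feat_dim=2048, latent_dim=256, n_tags=1000, n_classes=20):
        super().__init__()
        # Shared latent representation of the image (stands in for the shared
        # semantic layer coupling the visual and textual modalities).
        self.encoder = nn.Sequential(nn.Linear(feat_dim, latent_dim), nn.ReLU())
        # Annotation head: multi-label prediction over a tag vocabulary.
        self.annotation_head = nn.Linear(latent_dim, n_tags)
        # Classification head: one category per image.
        self.classification_head = nn.Linear(latent_dim, n_classes)

    def forward(self, img_feats):
        z = self.encoder(img_feats)
        return self.annotation_head(z), self.classification_head(z)

# Toy usage with random data standing in for precomputed visual features.
model = JointAnnotateClassify()
img_feats = torch.randn(8, 2048)                       # visual features (assumed)
tag_targets = torch.randint(0, 2, (8, 1000)).float()   # multi-hot free-text tags
class_targets = torch.randint(0, 20, (8,))             # category labels

tag_logits, class_logits = model(img_feats)
loss = (F.binary_cross_entropy_with_logits(tag_logits, tag_targets)
        + F.cross_entropy(class_logits, class_targets))  # joint objective
loss.backward()
```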


Updated: 2020-03-31