GLCM: Global-Local Captioning Model for Remote Sensing Image Captioning.
IEEE Transactions on Cybernetics (IF 11.8) Pub Date: 2023-10-17, DOI: 10.1109/tcyb.2022.3222606
Qi Wang 1 , Wei Huang 1 , Xueting Zhang 1 , Xuelong Li 1

Remote sensing image captioning (RSIC), which describes a remote sensing image with a semantically related sentence, is a cross-modal challenge spanning computer vision and natural language processing. Among the visual features extracted from remote sensing images, global features capture the overall visual relevance of all the words in a sentence at once, while local features emphasize what distinguishes each word individually. Global features are therefore important for caption generation, and local features help make individual words more discriminative. To exploit the advantages of both, in this article we propose an attention-based global-local captioning model (GLCM) that obtains a global-local visual feature representation for RSIC. With the proposed GLCM, the correlation among all generated words, and the relation between each word and its most related local visual features, can be visualized in a similarity-based manner, which makes RSIC more interpretable. In extensive experiments, our method achieves comparable results on UCM-captions and superior results on Sydney-captions and RSICD, the largest RSIC dataset.
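To make the global-local idea concrete, the following is a minimal NumPy sketch of one way a global feature and attention-weighted local features could be combined for a single word. It is an illustration only and not the authors' GLCM architecture: the mean-pooled global feature, the scaled dot-product attention, the concatenation step, and all names (`global_local_feature`, `local_feats`, `query`) are assumptions introduced here, since the abstract does not specify these details.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def global_local_feature(local_feats, query):
    """Fuse a global feature with attention-weighted local features.

    local_feats: (R, D) array, one D-dim feature per image region (assumed).
    query:       (D,) array, e.g. a decoder state for the current word (assumed).
    Returns the fused (2*D,) feature and the (R,) attention weights.
    """
    # Global feature: here simply the mean over all regions (an assumption).
    global_feat = local_feats.mean(axis=0)

    # Similarity between the query and each region, scaled by sqrt(D).
    scores = local_feats @ query / np.sqrt(local_feats.shape[1])
    weights = softmax(scores)

    # Local feature: regions weighted by their similarity to the query.
    attended_local = weights @ local_feats

    # Fusion by concatenation (one of several possible choices).
    fused = np.concatenate([global_feat, attended_local])
    return fused, weights


# Hypothetical inputs: a 7x7 grid of regions with 8-dim features.
rng = np.random.default_rng(0)
local_feats = rng.normal(size=(49, 8))
query = rng.normal(size=8)

fused, weights = global_local_feature(local_feats, query)
top_region = int(np.argmax(weights))  # region most related to this word
```

The `weights` vector is the kind of quantity that can be rendered as a heat map over image regions, which is the sense in which a similarity-based attention makes the word-to-region relation inspectable.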

Updated: 2022-11-29