Hierarchical Decoding Network Based on Swin Transformer for Detecting Salient Objects in RGB-T Images
IEEE Signal Processing Letters (IF 3.2), Pub Date: 2022-07-29, DOI: 10.1109/lsp.2022.3194843
Fan Sun, Wujie Zhou, Lv Ye, Lu Yu

Although conventional deep convolutional neural networks are effective for contextual semantic segmentation of objects, recent vision transformers can capture global information of an image and are better at capturing semantic associations over longer ranges. In addition, some existing saliency detection methods disregard the guidance of high-level semantic information for low-level features during decoding, and only use layer-by-layer transmission for encoding. Therefore, we propose a hierarchical decoding network based on the Swin Transformer to perform red–green–blue and thermal (RGB-T) salient object detection (SOD). First, a sine–cosine fusion module performs multimodality intersections and exploits complementarity. As a second fusion stage, an advanced semantic information guidance module adjusts high-level semantic information and low-level detailed characteristics. Finally, a global saliency perception module fuses cross-layer information in a top-down path. Comprehensive experiments demonstrate that the proposed network outperforms 12 state-of-the-art methods on three RGB-T SOD datasets.

Updated: 2022-07-29