Adaptive spatial pixel-level feature fusion network for multispectral pedestrian detection,Infrared Physics & Technology



Adaptive spatial pixel-level feature fusion network for multispectral pedestrian detection
Infrared Physics & Technology (IF 3.1), Pub Date: 2021-05-07, DOI: 10.1016/j.infrared.2021.103770
Lei Fu, Wen-bin Gu, Yong-bao Ai, Wei Li, Dong Wang

Under challenging illumination conditions, a pedestrian detector that takes visible and thermal infrared image pairs as input performs better than one that uses visible images alone. To fuse the complementary information in visible and thermal infrared images efficiently and effectively, this paper proposes an adaptive spatial pixel-level feature fusion network, ASPFF Net, which adaptively extracts spatial pixel-level features from visible and infrared images for fusion. First, two lightweight networks with different weights extract multi-scale features from the visible and infrared images. Next, for features of the same scale but different modalities, a spatial attention module (SAM) and a pixel attention module (PAM) produce fusion weights for the different spatial positions and pixels of the two feature maps. These fusion weights recalibrate the original visible and infrared features, yielding multi-scale fused feature layers. Finally, pedestrians of different scales are detected on the fused multi-scale feature layers. Compared with other recent multispectral pedestrian detectors on the reasonable subset of the KAIST multispectral pedestrian detection dataset, the proposed detector offers an attractive balance of speed and accuracy. Extensive experiments on the KAIST dataset demonstrate the effectiveness of the proposed method for fusing visible and infrared images in multispectral pedestrian detection.
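The recalibration step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how SAM and PAM compute their weights or how the two kinds of weight are combined, so everything below is an assumption made for illustration. The function names (`softmax2`, `channel_mean`, `fuse`) are hypothetical, the channel-mean score stands in for a learned spatial attention module, the raw activation stands in for a learned pixel attention module, and multiplying and renormalising the two weights is one possible combination rule among many.

```python
import math


def softmax2(a, b):
    """Two-way softmax: returns positive weights (wa, wb) with wa + wb == 1."""
    m = max(a, b)  # subtract the max for numerical stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def channel_mean(feat):
    """Collapse a C x H x W feature map to H x W by averaging over channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sum(feat[k][i][j] for k in range(c)) / c for j in range(w)]
            for i in range(h)]


def fuse(vis, ir):
    """Fuse two same-shape C x H x W feature maps (nested lists).

    Assumed stand-ins for the learned modules:
      * spatial weight: one weight per (row, col), shared by all channels,
        scored here by the channel mean of each modality;
      * pixel weight: one weight per individual feature element,
        scored here by the raw activation of each modality.
    """
    c, h, w = len(vis), len(vis[0]), len(vis[0][0])
    s_vis, s_ir = channel_mean(vis), channel_mean(ir)
    fused = [[[0.0] * w for _ in range(h)] for _ in range(c)]
    for i in range(h):
        for j in range(w):
            sw_vis, sw_ir = softmax2(s_vis[i][j], s_ir[i][j])
            for k in range(c):
                pw_vis, pw_ir = softmax2(vis[k][i][j], ir[k][i][j])
                # Combine both weights, then renormalise so they sum to 1.
                a, b = sw_vis * pw_vis, sw_ir * pw_ir
                fused[k][i][j] = (a * vis[k][i][j] + b * ir[k][i][j]) / (a + b)
    return fused


# Toy 2-channel, 2 x 2 feature maps for one scale (made-up numbers).
vis_feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.0], [1.0, 2.0]]]
ir_feat = [[[2.0, 1.0], [3.0, 0.0]], [[0.5, 4.0], [0.0, 2.0]]]
fused_feat = fuse(vis_feat, ir_feat)
print(fused_feat)
```

Because the weights at each element are positive and sum to one, every fused value lies between the corresponding visible and infrared values, and two identical inputs are returned unchanged. In a multi-scale setting, the same function would be applied once per scale to the matching pair of feature maps.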


Updated: 2021-05-17
