Data augmentation for thermal infrared object detection with cascade pyramid generative adversarial network, Applied Intelligence


Applied Intelligence (IF 5.3), Pub Date: 2021-05-14, DOI: 10.1007/s10489-021-02445-9
Xuerui Dai, Xue Yuan, Xueye Wei

Object detectors based on convolutional neural networks (CNNs) need large amounts of data to train effectively. Data augmentation techniques aim to generate more data, which can improve the generalization ability and robustness of a detection network. In thermal infrared (TIR) images, objects are difficult to label because of heavy noise and low resolution, so data augmentation is highly recommended. However, traditional data augmentation strategies (such as image flipping and random color jittering) produce only limited training samples. To generate high-resolution images that follow the distribution of real samples, a generative adversarial network (GAN) is introduced. To generate high-resolution samples, image pyramids are fed into different branches, and the resulting cascade features are fused to improve the resolution gradually. To improve the discriminative capability of the discriminator, a feature matching loss is computed during training, and generated images at different resolutions are discriminated in multiple stages. The data augmentation algorithm proposed in this paper is called the cascade pyramid generative adversarial network (CPGAN). On both the KAIST Multispectral data set and the OSU thermal-color data set, CPGAN greatly improves the detection accuracy of classical detection algorithms. In addition, detection speed is unaffected because CPGAN is used only in the training phase.
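
The abstract names two ingredients that are easy to illustrate in isolation: an image pyramid (the same image at progressively lower resolutions) and a feature matching loss (a distance between discriminator features of real and generated images). The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names `downsample_2x`, `build_pyramid`, and `feature_matching_loss` are hypothetical, 2x2 average pooling stands in for whatever downsampling the paper uses, and the loss is assumed to be a per-layer mean absolute difference, which is one common formulation.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def downsample_2x(img: Image) -> Image:
    """Halve the resolution by averaging each non-overlapping 2x2 block.

    Assumption: average pooling; the paper may use a different filter.
    """
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(w)
        ]
        for r in range(h)
    ]


def build_pyramid(img: Image, levels: int) -> List[Image]:
    """Return [full-res, half-res, quarter-res, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


def feature_matching_loss(real_feats: List[List[float]],
                          fake_feats: List[List[float]]) -> float:
    """Mean absolute difference per layer, averaged over layers.

    Each inner list holds the flattened discriminator features of one
    layer (hypothetical inputs; a real model would supply tensors).
    """
    per_layer = [
        sum(abs(r - f) for r, f in zip(real, fake)) / len(real)
        for real, fake in zip(real_feats, fake_feats)
    ]
    return sum(per_layer) / len(per_layer)
```

In a cascade design of the kind the abstract describes, each pyramid level would presumably feed its own generator branch, and a loss of this form would be evaluated on the discriminator features at each resolution stage.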


Updated: 2021-05-14
