Pedestrian Recognition using Cross-Modality Learning in Convolutional Neural Networks
IEEE Intelligent Transportation Systems Magazine (IF 3.6) Pub Date: 2021-01-01, DOI: 10.1109/mits.2019.2926364
Danut Ovidiu Pop , Alexandrina Rogozan , Fawzi Nashashibi , Abdelaziz Bensrhair

The combination of multi-modal image fusion schemes with deep learning classification methods, particularly Convolutional Neural Networks (CNNs), has achieved remarkable performance in the pedestrian detection field. The late fusion scheme has significantly enhanced the performance of the pedestrian recognition task. In this paper, the late fusion scheme combined with CNN learning is investigated in depth for pedestrian recognition on the Daimler stereo vision dataset. An independent CNN is trained for each imaging modality (intensity, depth, and optical flow) before the CNNs' probabilistic output scores are fused by a Multi-Layer Perceptron (MLP), which provides the recognition decision. We propose four different learning patterns based on cross-modality deep learning of CNNs: (1) Particular Cross-Modality Learning; (2) Separate Cross-Modality Learning; (3) Correlated Cross-Modality Learning; and (4) Incremental Cross-Modality Learning. Moreover, we design a new CNN architecture, called LeNet+, which improves classification performance not only for each single-modality classifier but also for the multi-modality late-fusion scheme. Finally, we propose learning the LeNet+ model with the incremental cross-modality approach using optimal learning settings obtained with a K-fold cross-validation pattern. This method outperforms the state-of-the-art classifier provided with the Daimler datasets on both non-occluded and partially occluded pedestrian tasks.
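The late-fusion step described above — one probabilistic score per modality CNN, fused by a small MLP that makes the final decision — can be sketched in plain Python. This is a minimal illustration, not the paper's implementation: the per-modality scores and the MLP weights below are hypothetical (in the paper, the MLP is trained on the outputs of the intensity, depth, and optical-flow CNNs).

```python
import math
import random

def mlp_fuse(scores, w1, b1, w2, b2):
    """Fuse per-modality pedestrian scores with a one-hidden-layer MLP.

    scores: concatenated probabilistic outputs of the modality CNNs,
    e.g. [intensity, depth, optical_flow]. Returns a fused pedestrian
    probability in (0, 1).
    """
    # Hidden layer with ReLU activation.
    hidden = []
    for row, b in zip(w1, b1):
        z = sum(w * x for w, x in zip(row, scores)) + b
        hidden.append(max(0.0, z))
    # Single output neuron with a sigmoid -> fused probability.
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical per-modality scores (stand-ins for the CNN outputs).
scores = [0.9, 0.7, 0.8]  # intensity, depth, optical flow

# Toy random weights; a real fusion MLP would learn these from data.
random.seed(0)
w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b1 = [0.0] * 4
w2 = [random.uniform(-1, 1) for _ in range(4)]
b2 = 0.0

fused = mlp_fuse(scores, w1, b1, w2, b2)
print(f"fused pedestrian probability: {fused:.3f}")
```

With trained weights, this fused probability would be thresholded to yield the pedestrian/non-pedestrian decision; the cross-modality learning patterns in the paper differ in how the per-modality CNNs feeding this fusion stage are trained.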

Updated: 2021-01-01