Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification,International Journal of Computer Vision

Guided Attention in CNNs for Occluded Pedestrian Detection and Re-identification
International Journal of Computer Vision (IF 11.6), Pub Date: 2021-04-08, DOI: 10.1007/s11263-021-01461-z
Shanshan Zhang, Di Chen, Jian Yang, Bernt Schiele

Pedestrian detection and re-identification have progressed significantly in the last few years. However, occluded people are notoriously hard to detect and recognize, as their appearance varies substantially across a wide range of occlusion patterns. In this paper, we propose a simple and compact CNN-based method for occlusion handling. We start by interpreting the CNN channel features of a pedestrian detector, and we find that different channels respond to different body parts. These findings motivate us to employ an attention mechanism across channels to represent various occlusion patterns in a single model, as each occlusion pattern can be formulated as a specific combination of body parts. We therefore propose an attention network with self or external guidance as an add-on to the baseline CNN method. We also propose an attention-guided self-paced learning method to balance the optimization across different occlusion levels. Our proposed method shows significant improvements over the baseline methods for both pedestrian detection and re-identification. For pedestrian detection, we achieve an improvement of 8 percentage points (pp) over the baseline Faster R-CNN detector on the heavy occlusion subset of CityPersons, and on Caltech we outperform the state-of-the-art method by 5 pp. For pedestrian re-identification, our method surpasses the baseline and achieves state-of-the-art performance on multiple re-identification benchmarks.
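
To make the idea of attention across channels concrete, here is a minimal sketch in Python with NumPy. It is not the authors' network: the abstract does not specify the architecture, so the sketch swaps in a squeeze-and-excitation style gate as a stand-in for the self-guided case. The function name `channel_attention`, the weight matrices `w1` and `w2`, and all shapes are assumptions made for illustration only.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Re-weight the channels of a (C, H, W) feature map.

    Hypothetical self-guided variant: the per-channel weights are computed
    from the features themselves (global average pool -> two-layer MLP ->
    sigmoid). This is an illustrative stand-in, not the paper's network.
    """
    pooled = features.mean(axis=(1, 2))              # (C,) one value per channel
    hidden = np.maximum(w1 @ pooled, 0.0)            # (C // R,) ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # (C,) each in (0, 1)
    # Broadcast the weights over the spatial dimensions.
    return features * weights[:, None, None], weights

# Toy input: 8 channels on a 4x4 grid, bottleneck ratio 2 (all assumed values).
rng = np.random.default_rng(0)
C, H, W, R = 8, 4, 4, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // R, C)) * 0.1
w2 = rng.standard_normal((C, C // R)) * 0.1

reweighted, weights = channel_attention(features, w1, w2)
```

If each channel responds mainly to one body part, as the abstract reports, a weight near zero suppresses the channels for parts that are hidden, so one weight vector corresponds to one occlusion pattern. An externally guided variant would compute `weights` from another signal instead of from `features`; the abstract does not say which signal, so that case is left out of the sketch.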

Updated: 2021-04-08
