Do Semantic Parts Emerge in Convolutional Neural Networks?
International Journal of Computer Vision (IF 19.5), Pub Date: 2017-10-17, DOI: 10.1007/s11263-017-1048-0
Abel Gonzalez-Garcia, Davide Modolo, Vittorio Ferrari

Semantic object parts can be useful for several visual recognition tasks. Lately, these tasks have been addressed using Convolutional Neural Networks (CNNs), achieving outstanding results. In this work we study whether CNNs learn semantic parts in their internal representation. We investigate the responses of convolutional filters and try to associate their stimuli with semantic parts. We perform two extensive quantitative analyses. First, we use ground-truth part bounding boxes from the PASCAL-Part dataset to determine how many of those semantic parts emerge in the CNN. We explore this emergence for different layers, network depths, and supervision levels. Second, we collect human judgements to study what fraction of all filters systematically fire on any semantic part, even one not annotated in PASCAL-Part. Moreover, we explore several connections between discriminative power and semantics. We identify the most discriminative filters for object recognition and analyze whether they respond to semantic parts or to other image patches. We also investigate the other direction: we determine which semantic parts are the most discriminative and whether they correspond to the parts emerging in the network. This enables us to gain an even deeper understanding of the role of semantic parts in the network.
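
The abstract does not say how filter responses are compared with the ground-truth part bounding boxes. As an illustration only, the sketch below assumes a common choice: a part counts as covered when some box derived from a filter's activation overlaps it with intersection-over-union (IoU) above a threshold. The names `iou` and `fraction_of_parts_covered`, the box layout, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fraction_of_parts_covered(
    activation_boxes: Sequence[Box],
    part_boxes: Sequence[Box],
    threshold: float = 0.5,  # assumed value, not from the paper
) -> float:
    """Fraction of ground-truth part boxes matched by at least one activation box."""
    if not part_boxes:
        return 0.0
    covered = sum(
        1
        for part in part_boxes
        if any(iou(act, part) >= threshold for act in activation_boxes)
    )
    return covered / len(part_boxes)
```

Under this assumption, a filter whose activation boxes cover a large fraction of one part class across many images would be a candidate detector for that part.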

Updated: 2017-10-17