CompactNets: Compact Hierarchical Compositional Networks for Visual Recognition, Computer Vision and Image Understanding



Computer Vision and Image Understanding (IF 4.3), Pub Date: 2019-10-28, DOI: 10.1016/j.cviu.2019.102841
Hans Lobel, René Vidal, Alvaro Soto

CNN-based models currently provide state-of-the-art performance in image categorization tasks. While these methods are powerful in terms of representational capacity, they are generally not conceived with explicit means to control complexity. This might lead to scenarios where resources are used in a non-optimal manner, increasing the number of unspecialized or repeated neurons, and overfitting to data. In this work we propose CompactNets, a new approach to visual recognition that learns a hierarchy of shared, discriminative, specialized, and compact representations. CompactNets naturally capture the notion of compositional compactness, a characterization of complexity in compositional models that consists of using the smallest number of patterns to build a suitable visual representation. We employ a structural regularizer with group-sparse terms in the objective function that induces, on each layer, an efficient and effective use of elements from the layer below. In particular, this allows groups of top-level features to be specialized based on category information. We evaluate CompactNets on the ILSVRC12 dataset, obtaining compact representations and competitive performance while using an order of magnitude fewer parameters than common CNN-based approaches. We show that CompactNets outperform other group-sparse-based approaches in terms of performance and compactness. Finally, transfer-learning experiments on small-scale datasets demonstrate high generalization power, providing remarkable categorization performance with respect to alternative approaches.
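
The abstract does not give the form of the structural regularizer, so the following is only a generic illustration of what a group-sparse term does, not the authors' formulation. It assumes a hypothetical weight matrix `W` in which each column collects the weights that one lower-layer element contributes to the layer above, and it uses the standard group-lasso penalty (the sum of per-group L2 norms) together with its proximal step. The names `group_sparse_penalty`, `group_soft_threshold`, `lam`, and `tau` are made up for this sketch.

```python
import numpy as np

def group_sparse_penalty(W, lam):
    # Assumption: one group per column, i.e. per lower-layer element.
    # The penalty is lam times the sum of the columns' L2 norms.
    return lam * np.sum(np.linalg.norm(W, axis=0))

def group_soft_threshold(W, tau):
    # Proximal step for the penalty above: shrink each column's norm by tau
    # and zero out every column whose norm is at most tau.
    norms = np.linalg.norm(W, axis=0, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return W * scale

# Toy weights: one strongly used element, one weakly used, one unused.
W = np.array([[3.0, 0.1, 0.0],
              [4.0, 0.2, 0.0]])

penalty = group_sparse_penalty(W, lam=1.0)
W_shrunk = group_soft_threshold(W, tau=1.0)

# Indices of lower-layer elements that are still used after shrinkage.
active_columns = np.flatnonzero(np.linalg.norm(W_shrunk, axis=0) > 0)
```

Because whole columns are driven to zero together, a weakly used lower-layer element is dropped entirely instead of being kept with small weights, which is one way to read "using the smallest number of patterns" in this kind of model.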


Updated: 2020-01-04
