Inference, Learning and Attention Mechanisms that Exploit and Preserve Sparsity in CNNs
International Journal of Computer Vision (IF 19.5) Pub Date: 2020-03-04, DOI: 10.1007/s11263-020-01302-5
Timo Hackel, Mikhail Usvyatsov, Silvano Galliani, Jan D. Wegner, Konrad Schindler

Convolutional neural networks (CNNs) are a powerful tool for pattern recognition and computer vision, but they do not scale well to higher-dimensional inputs because of the memory demands of storing and manipulating high-dimensional tensors. This work starts from the observation that higher-dimensional data, such as 3D voxel volumes, are sparsely populated. CNNs naturally lend themselves to densely sampled data, for which sophisticated, massively parallel implementations are available; in contrast, existing frameworks by and large lack the ability to efficiently process sparse data. Here, we introduce a suite of tools that exploit sparsity in both the feature maps and the filter weights of a CNN, and thereby allow for significantly lower memory footprints and computation times than the conventional dense framework when processing data with a high degree of sparsity. Our scheme provides (i) an efficient GPU implementation of a convolution layer based on direct, sparse convolution, as well as sparse implementations of the ReLU and max-pooling layers; (ii) a filter step within the convolution layer, which we call attention, that prevents fill-in, i.e., the tendency of convolution to rapidly decrease sparsity, and guarantees an upper bound on the computational resources; and (iii) an adaptation of back-propagation that makes it possible to combine our approach with standard learning frameworks, while still benefiting from sparsity in the data as well as the model.
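
To make contribution (i) concrete, here is a minimal NumPy sketch of direct sparse convolution over a coordinate-indexed voxel grid. It is not the authors' GPU implementation; the name sparse_conv3d and the dict-of-coordinates representation are illustrative assumptions.

import numpy as np

def sparse_conv3d(active, weights, kernel_offsets):
    # active: dict mapping voxel coordinate (x, y, z) -> input feature vector (C_in,)
    # weights: dict mapping a kernel offset (dx, dy, dz) -> (C_out, C_in) weight matrix
    # kernel_offsets: the offsets whose filter weights are non-zero
    #
    # Each active input voxel scatters its contribution to the output sites it
    # touches, so the cost scales with (number of active voxels) x (number of
    # non-zero filter offsets), not with the size of the full volume.
    out = {}
    for (x, y, z), feat in active.items():
        for (dx, dy, dz) in kernel_offsets:
            key = (x + dx, y + dy, z + dz)
            contrib = weights[(dx, dy, dz)] @ feat
            if key in out:
                out[key] += contrib
            else:
                out[key] = contrib
    return out

Note that the output dict generally holds more active sites than the input: every active voxel activates its whole kernel neighbourhood. This is exactly the fill-in that the attention step (ii) is designed to prevent.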

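The attention step (ii) can then be pictured as a pruning pass that caps the number of active output sites after each convolution. The sketch below assumes that "strongest" means largest activation norm; that criterion and the name attention_filter are hypothetical, chosen only to show how a hard cap yields the guaranteed upper bound on computational resources.

import heapq
import numpy as np

def attention_filter(feature_map, k):
    # feature_map: dict mapping voxel coordinate -> output feature vector.
    # Keep only the k sites with the largest activation norm; all other sites
    # are dropped, so the active set after every convolution is bounded by k
    # no matter how much fill-in the convolution caused.
    if len(feature_map) <= k:
        return feature_map
    top = heapq.nlargest(k, feature_map.items(),
                         key=lambda item: np.linalg.norm(item[1]))
    return dict(top)

# Toy usage: 200 random active sites, capped to the 64 strongest responses.
rng = np.random.default_rng(0)
fmap = {(i, 2 * i, 3 * i): rng.standard_normal(8) for i in range(200)}
pruned = attention_filter(fmap, k=64)
assert len(pruned) == 64

Because both the convolution and the filter operate on explicit coordinate lists, back-propagation (iii) only ever needs gradients at the surviving active sites, which is one way to see how sparsity can carry over from inference to training.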
Updated: 2020-03-04