Understanding Image Representations by Measuring Their Equivariance and Equivalence, International Journal of Computer Vision


International Journal of Computer Vision (IF 19.5), Pub Date: 2018-05-18, DOI: 10.1007/s11263-018-1098-y
Karel Lenc, Andrea Vedaldi


Despite the importance of image representations such as histograms of oriented gradients and deep Convolutional Neural Networks (CNN), our theoretical understanding of them remains limited. Aimed at filling this gap, we investigate two key mathematical properties of representations: equivariance and equivalence. Equivariance studies how transformations of the input image are encoded by the representation, invariance being a special case where a transformation has no effect. Equivalence studies whether two representations, for example two different parameterizations of a CNN, two different layers, or two different CNN architectures, share the same visual information or not. A number of methods to establish these properties empirically are proposed, including introducing transformation and stitching layers in CNNs. These methods are then applied to popular representations to reveal insightful aspects of their structure, including clarifying at which layers in a CNN certain geometric invariances are achieved and how various CNN architectures differ. We identify several predictors of geometric and architectural compatibility, including the spatial resolution of the representation and the complexity and depth of the models. While the focus of the paper is theoretical, direct applications to structured-output regression are demonstrated too.
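The distinction between equivariance and invariance drawn in the abstract can be illustrated with a toy sketch. This is not the paper's method: the 1-D signal, the cyclic shift used as the transformation, and the helper names `circular_conv`, `shift`, and `pooled` are all assumptions chosen for illustration. A circular filter response moves along with a shifted input (equivariance), while summing that response over all positions discards the shift (invariance).

```python
import numpy as np

def circular_conv(x, k):
    """Toy 1-D 'representation': circular cross-correlation of signal x with filter k."""
    n = len(x)
    return np.array([sum(k[j] * x[(i + j) % n] for j in range(len(k)))
                     for i in range(n)])

def shift(x, t):
    """Toy input transformation g: cyclic shift by t samples."""
    return np.roll(x, t)

def pooled(x, k):
    """Toy pooled representation: sum of the filter response over all positions."""
    return circular_conv(x, k).sum()

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # hypothetical input signal
k = rng.standard_normal(3)    # hypothetical filter

# Equivariance: transforming the input transforms the representation in a
# predictable way (here, by the same cyclic shift).
equivariant = np.allclose(circular_conv(shift(x, 5), k),
                          shift(circular_conv(x, k), 5))

# Invariance (special case): the transformation has no effect on the representation.
invariant = np.isclose(pooled(shift(x, 5), k), pooled(x, k))

print(equivariant, invariant)
```

In this toy setting the map acting on the representation happens to be the same shift applied to the input; for real image representations and other geometric transformations, that map generally has to be estimated empirically, which is what the abstract's transformation layers are for.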


Updated: 2018-05-18
