Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework
arXiv - CS - Numerical Analysis Pub Date : 2021-06-16 , DOI: arxiv-2106.09121
Jiahao Su, Wonmin Byeon, Furong Huang

Enforcing orthogonality in neural networks is an antidote to gradient vanishing/exploding, sensitivity to adversarial perturbations, and unbounded generalization error. However, many previous approaches are heuristic, and the orthogonality of convolutional layers has not been systematically studied: some of these designs are not exactly orthogonal, while others consider only standard convolutional layers and propose specific classes of realizations. To address this problem, we propose a theoretical framework for orthogonal convolutional layers, which establishes an equivalence between orthogonal convolutional layers in the spatial domain and paraunitary systems in the spectral domain. Since paraunitary systems admit a complete spectral factorization, any orthogonal convolutional layer can be parameterized as a convolution of spatial filters. Our framework endows various convolutional layers with high expressive power while maintaining their exact orthogonality. Furthermore, compared to previous designs, our layers are memory- and computation-efficient for deep networks. Our versatile framework, for the first time, enables the study of architectural designs for deep orthogonal networks, such as choices of skip connection, initialization, stride, and dilation. Consequently, we scale up orthogonal networks to deep architectures, including ResNet, WideResNet, and ShuffleNet, substantially improving performance over traditional shallow orthogonal networks.
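The spectral factorization the abstract relies on can be illustrated numerically. A minimal sketch (dimensions and variable names are illustrative, not taken from the paper): a paraunitary filter bank factors into degree-one factors V(z) = (I − vvᵀ) + z⁻¹vvᵀ applied to a constant orthogonal matrix, and the circular convolution defined by the resulting filter taps preserves the ℓ2 norm of its input, which is the defining property of an orthogonal convolutional layer.

```python
import numpy as np

rng = np.random.default_rng(0)
c, taps = 4, 3  # channels and filter length (degree + 1); illustrative sizes

# Degree-0 factor: a constant orthogonal matrix from a QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((c, c)))

# Filter taps H[k] of the cascade E(z) = V_{taps-1}(z) ... V_1(z) Q,
# where each V(z) = (I - v v^T) + z^{-1} v v^T for a unit vector v.
H = np.zeros((taps, c, c))
H[0] = Q
for _ in range(taps - 1):
    v = rng.standard_normal(c)
    v /= np.linalg.norm(v)
    P = np.outer(v, v)  # rank-1 orthogonal projection, so (I - P) P = 0
    Hnew = np.zeros_like(H)
    Hnew += np.einsum('ij,tjk->tik', np.eye(c) - P, H)   # (I - P) part
    Hnew[1:] += np.einsum('ij,tjk->tik', P, H[:-1])      # z^{-1} P part
    H = Hnew

# Apply the layer as a circular convolution: y[t] = sum_k H[k] x[t - k].
n = 16
x = rng.standard_normal((c, n))
y = sum(H[k] @ np.roll(x, k, axis=1) for k in range(taps))

# Exact orthogonality: the layer preserves the L2 norm of the input.
print(np.allclose(np.linalg.norm(y), np.linalg.norm(x)))
```

Each factor V(z) satisfies V~(z)V(z) = I because (I − P)P = 0, so the full cascade is paraunitary on the unit circle and the convolution is norm-preserving, regardless of how many factors are stacked; this is the sense in which any orthogonal convolution can be parameterized by spatial filters.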

Updated: 2021-06-18