Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
arXiv - CS - Computational Complexity | Pub Date: 2020-06-26 | DOI: arxiv-2006.14798 | Authors: Tolga Ergen, Mert Pilanci
We study training of Convolutional Neural Networks (CNNs) with ReLU
activations and introduce exact convex optimization formulations whose
complexity is polynomial in the number of data samples, the number of
neurons, and the data dimension. More specifically, we develop a convex analytic
framework utilizing semi-infinite duality to obtain equivalent convex
optimization problems for several two- and three-layer CNN architectures. We
first prove that two-layer CNNs can be globally optimized via an $\ell_2$ norm
regularized convex program. We then show that three-layer CNN training problems
are equivalent to an $\ell_1$ regularized convex program that encourages
sparsity in the spectral domain. We also extend these results to multi-layer
CNN architectures including three-layer networks with two ReLU layers and
deeper circular convolutions with a single ReLU layer. Furthermore, we present
extensions of our approach to different pooling methods, which elucidate the
implicit architectural bias of each design as a convex regularizer.
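
As background for the "spectral domain" and "circular convolutions" mentioned above, the following minimal sketch illustrates a standard fact rather than the paper's formulation: a circular convolution becomes an elementwise product after a discrete Fourier transform. The function names (`dft`, `idft`, `circular_conv`, `circular_conv_via_dft`) and the sample vectors are assumptions chosen for illustration and do not come from the paper.

```python
import cmath


def dft(x):
    """Naive O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(X):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def circular_conv(x, h):
    """Circular convolution computed directly from its definition."""
    n = len(x)
    return [sum(x[(t - s) % n] * h[s] for s in range(n)) for t in range(n)]


def circular_conv_via_dft(x, h):
    """Same result via the spectral domain: multiply the DFTs elementwise."""
    X, H = dft(x), dft(h)
    product = [a * b for a, b in zip(X, H)]
    # Inputs are real, so the imaginary parts are only rounding noise.
    return [v.real for v in idft(product)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]      # hypothetical input
    kernel = [1.0, 0.0, -1.0, 0.0]     # hypothetical filter
    print(circular_conv(signal, kernel))
    print(circular_conv_via_dft(signal, kernel))
```

Both functions should agree up to floating-point rounding. This is one way to see why a penalty on circular-convolution filters can be read as a penalty on their Fourier coefficients; the exact regularizers derived in the paper are not reproduced here.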
Updated: 2020-10-06