Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time, arXiv - CS - Computational Complexity

arXiv - CS - Computational Complexity, Pub Date: 2020-06-26, DOI: arxiv-2006.14798
Tolga Ergen, Mert Pilanci

We study the training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with polynomial complexity in the number of data samples, the number of neurons, and the data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an $\ell_2$ norm regularized convex program. We then show that three-layer CNN training problems are equivalent to an $\ell_1$ regularized convex program that encourages sparsity in the spectral domain. We also extend these results to multi-layer CNN architectures, including three-layer networks with two ReLU layers and deeper circular convolutions with a single ReLU layer. Furthermore, we present extensions of our approach to different pooling methods, which elucidate the implicit architectural bias as convex regularizers.
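The abstract refers to circular convolutions and to sparsity "in the spectral domain". As background only, and not as the authors' formulation, the short sketch below checks the standard identity that a circular convolution becomes an elementwise product under the discrete Fourier transform. The function names (`dft`, `idft`, `circular_conv`, `circular_conv_spectral`) and the random test vectors are illustrative assumptions, not taken from the paper.

```python
import cmath
import random


def dft(x):
    """Naive discrete Fourier transform of a sequence (O(d^2), for illustration)."""
    d = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / d) for n in range(d))
            for k in range(d)]


def idft(X):
    """Naive inverse discrete Fourier transform."""
    d = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / d) for k in range(d)) / d
            for n in range(d)]


def circular_conv(x, h):
    """Direct circular convolution: y[n] = sum_k h[k] * x[(n - k) mod d]."""
    d = len(x)
    return [sum(h[k] * x[(n - k) % d] for k in range(d)) for n in range(d)]


def circular_conv_spectral(x, h):
    """Same convolution computed as an elementwise product of the two DFTs."""
    X, H = dft(x), dft(h)
    return [v.real for v in idft([a * b for a, b in zip(X, H)])]


# Hypothetical test data: two random real vectors of length 8.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(8)]
h = [random.gauss(0, 1) for _ in range(8)]

# Largest absolute difference between the direct and spectral results.
max_gap = max(abs(a - b)
              for a, b in zip(circular_conv(x, h), circular_conv_spectral(x, h)))
print(max_gap < 1e-9)
```

Because the filter acts diagonally in the Fourier basis, a penalty on the transformed filter coefficients is one natural way to read "sparsity in the spectral domain"; the exact regularizer used in the paper is not reproduced here.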

Updated: 2020-10-06
