Norm-Preservation: Why Residual Networks Can Become Extremely Deep?
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2020-04-28, DOI: 10.1109/tpami.2020.2990339
Alireza Zaeemzadeh, Nazanin Rahnavard, Mubarak Shah

Augmenting neural networks with skip connections, as introduced in the so-called ResNet architecture, surprised the community by enabling the training of networks with more than 1,000 layers and significant performance gains. This paper deciphers ResNet by analyzing the effect of skip connections and puts forward new theoretical results on the advantages of identity skip connections in neural networks. We prove that the skip connections in residual blocks facilitate preserving the norm of the gradient and lead to stable back-propagation, which is desirable from an optimization perspective. We also show that, perhaps surprisingly, as more residual blocks are stacked, the norm-preservation of the network is enhanced. Our theoretical arguments are supported by extensive empirical evidence. Can we push for extra norm-preservation? We answer this question by proposing an efficient method to regularize the singular values of the convolution operator, making ResNet's transition layers extra norm-preserving. Our numerical investigations demonstrate that the learning dynamics and the classification performance of ResNet can be improved by making it even more norm-preserving. Our results, and the introduced modification to ResNet, referred to as Procrustes ResNets, can be used as a guide for training deeper networks and can also inspire new, deeper architectures.
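The norm-preservation claim is easy to probe numerically. The sketch below is a minimal illustration under stated assumptions, not the authors' code: the toy `Block`, the helper `grad_norm_ratio`, the width of 64, and the use of fully connected layers in place of convolutions are all assumptions made for the example. Each block computes y = x + F(x), so its backward Jacobian is I + ∂F/∂x, and the script measures how the norm of a back-propagated gradient changes across the stack with and without the identity skip:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """A toy residual block; set skip=False to drop the identity shortcut."""
    def __init__(self, dim, skip=True):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.skip = skip

    def forward(self, x):
        return x + self.f(x) if self.skip else self.f(x)

def grad_norm_ratio(depth, dim=64, skip=True):
    """Return ||dL/dx|| / ||dL/dy|| across `depth` stacked blocks.

    A ratio near 1 means back-propagation through the stack is
    norm-preserving.
    """
    torch.manual_seed(0)  # same weights for the skip / no-skip comparison
    net = nn.Sequential(*[Block(dim, skip) for _ in range(depth)])
    x = torch.randn(1, dim, requires_grad=True)
    y = net(x)
    g_out = torch.randn_like(y)  # stand-in for an upstream gradient dL/dy
    (g_in,) = torch.autograd.grad(y, x, grad_outputs=g_out)
    return (g_in.norm() / g_out.norm()).item()

for depth in (1, 10, 100):
    print(depth, grad_norm_ratio(depth, skip=True), grad_norm_ratio(depth, skip=False))
```

At random initialization the skip version keeps the ratio orders of magnitude closer to 1 than the plain stack, which attenuates the gradient exponentially in depth, in line with the stable back-propagation argument above.

The closing idea, regularizing the singular values of the convolution operator, can likewise be sketched for a plain weight matrix. The helper below is an illustrative assumption, not the paper's efficient regularizer: `clip_singular_values` and the band [0.9, 1.1] are made up for the example. It projects a matrix so that all singular values lie near 1, which is what makes the corresponding linear map, and hence the gradient flowing back through it, approximately norm-preserving:

```python
import torch

def clip_singular_values(W, lo=0.9, hi=1.1):
    """Project W so its singular values fall in [lo, hi] (illustrative only)."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    return U @ torch.diag(S.clamp(lo, hi)) @ Vh
```

With lo = hi = 1 this returns U @ Vh, the closest orthonormal matrix in the Frobenius sense and the solution of the orthogonal Procrustes problem, which is presumably the origin of the name Procrustes ResNets.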

Updated: 2024-08-22