Characterizing signal propagation to close the performance gap in unnormalized ResNets
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-01-21, DOI: arxiv-2101.08692
Andrew Brock, Soham De, Samuel L. Smith

Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs. Building on recent theoretical analyses of deep ResNets at initialization, we propose a simple set of analysis tools to characterize signal propagation on the forward pass, and leverage these tools to design highly performant ResNets without activation normalization layers. Crucial to our success is an adapted version of the recently proposed Weight Standardization. Our analysis tools show how this technique preserves the signal in networks with ReLU or Swish activation functions by ensuring that the per-channel activation means do not grow with depth. Across a range of FLOP budgets, our networks attain performance competitive with the state-of-the-art EfficientNets on ImageNet.
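
The abstract does not spell out the adapted Weight Standardization, so the following is only a minimal NumPy sketch of the generic idea, under stated assumptions: each output unit's weights are shifted to zero mean and scaled to variance 1/fan_in, and a single dense layer stands in for a convolution. The names `standardize_weights` and `mean_channel_shift` are hypothetical and do not come from the paper. The sketch compares the per-channel mean of the pre-activations that follow a ReLU, with and without standardized weights.

```python
import numpy as np

def standardize_weights(w, eps=1e-8):
    """Standardize each output unit's weights over its fan-in.

    w has shape (fan_out, fan_in). Each row is shifted to zero mean and
    scaled so its variance is about 1 / fan_in. This is an assumed, generic
    form of weight standardization, not the paper's exact formulation.
    """
    fan_in = w.shape[1]
    mean = w.mean(axis=1, keepdims=True)
    var = w.var(axis=1, keepdims=True)
    return (w - mean) / np.sqrt(fan_in * (var + eps))

def mean_channel_shift(w, x):
    """Average squared per-channel mean of the pre-activations relu(x) @ w.T."""
    h = np.maximum(x, 0.0)   # ReLU outputs have a positive mean
    z = h @ w.T              # shape (batch, fan_out)
    return float(np.mean(z.mean(axis=0) ** 2))

rng = np.random.default_rng(0)
fan_in, fan_out, batch = 256, 64, 1024
w = rng.normal(size=(fan_out, fan_in)) / np.sqrt(fan_in)  # plain Gaussian init
x = rng.normal(size=(batch, fan_in))                      # unit-variance inputs

raw_shift = mean_channel_shift(w, x)
ws_shift = mean_channel_shift(standardize_weights(w), x)
print(f"mean shift, raw weights:          {raw_shift:.5f}")
print(f"mean shift, standardized weights: {ws_shift:.5f}")
```

In this toy setting the rows of the standardized weight matrix sum to zero, so the positive mean produced by the ReLU largely cancels in the next layer's pre-activations, whereas with raw Gaussian weights each channel picks up a nonzero mean. This is meant as an illustration of why zero-mean weights can keep per-channel means from accumulating, not as a reproduction of the authors' analysis tools.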

Updated: 2021-01-22