Sandwich Batch Normalization
arXiv - CS - Artificial Intelligence Pub Date : 2021-02-22 , DOI: arxiv-2102.11382
Xinyu Gong, Wuyang Chen, Tianlong Chen, Zhangyang Wang

We present Sandwich Batch Normalization (SaBN), an embarrassingly easy improvement of Batch Normalization (BN) that requires only a few lines of code changes. SaBN is motivated by addressing the inherent feature distribution heterogeneity that can be identified in many tasks, arising from either data heterogeneity (multiple input domains) or model heterogeneity (dynamic architectures, model conditioning, etc.). SaBN factorizes the BN affine layer into one shared sandwich affine layer, cascaded with several parallel, independent affine layers. Concrete analysis reveals that, during optimization, SaBN promotes balanced gradient norms while still preserving diverse gradient directions: a property that many application tasks seem to favor. We demonstrate the prevailing effectiveness of SaBN as a drop-in replacement in four tasks: $\textbf{conditional image generation}$, $\textbf{neural architecture search}$ (NAS), $\textbf{adversarial training}$, and $\textbf{arbitrary style transfer}$. Leveraging SaBN immediately achieves better Inception Score and FID on CIFAR-10 and ImageNet conditional image generation with three state-of-the-art GANs; significantly boosts the performance of a state-of-the-art weight-sharing NAS algorithm on NAS-Bench-201; substantially improves both robust and standard accuracies for adversarial defense; and produces superior arbitrary stylized results. We also provide visualizations and analysis to help understand why SaBN works. Code is available at https://github.com/VITA-Group/Sandwich-Batch-Normalization.
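The factorization described above can be sketched in a few lines: after standard BN normalization, the output passes through one shared ("sandwich") affine transform and is then routed to one of several parallel, independently parameterized affine transforms. The following NumPy sketch illustrates the idea for 2D inputs; the class and parameter names, the 2D input shape, and the branch-selection interface are illustrative assumptions, not the authors' reference implementation (their PyTorch code is at the repository linked above).

```python
import numpy as np

class SandwichBN:
    """Illustrative sketch of Sandwich Batch Normalization (SaBN).

    The BN affine transform is factorized into one shared "sandwich"
    affine layer, cascaded with several parallel independent affine
    layers (e.g. one per class, domain, or sampled architecture).
    """

    def __init__(self, num_features, num_branches, eps=1e-5):
        self.eps = eps
        # Shared sandwich affine parameters (learned, shared by all branches).
        self.gamma_sa = np.ones(num_features)
        self.beta_sa = np.zeros(num_features)
        # Parallel independent affine parameters, one row per branch.
        self.gamma = np.ones((num_branches, num_features))
        self.beta = np.zeros((num_branches, num_features))

    def forward(self, x, branch):
        # Standard BN normalization over the batch axis; x has shape [N, C].
        mean = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        # Shared sandwich affine first, then the branch-specific affine.
        out = self.gamma_sa * x_hat + self.beta_sa
        return self.gamma[branch] * out + self.beta[branch]
```

In conditional image generation, for example, `branch` would be the class label of each generated batch, so all classes share the sandwich affine while keeping their own conditioning affine.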

Updated: 2021-02-24