Evolving Normalization-Activation Layers,arXiv - CS - Neural and Evolutionary Computing

Evolving Normalization-Activation Layers
arXiv - CS - Neural and Evolutionary Computing, Pub Date: 2020-04-06, DOI: arxiv-2004.02967
Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le

Normalization layers and activation functions are fundamental components in deep networks and typically co-locate with each other. Here we propose to design them using an automated approach. Instead of designing them separately, we unify them into a single tensor-to-tensor computation graph, and evolve its structure starting from basic mathematical functions. Examples of such mathematical functions are addition, multiplication and statistical moments. The use of low-level mathematical functions, in contrast to the use of high-level modules in mainstream NAS, leads to a large and highly sparse search space, which can be challenging for search methods. To address the challenge, we develop efficient rejection protocols to quickly filter out candidate layers that do not work well. We also use multi-objective evolution to optimize each layer's performance across many architectures to prevent overfitting. Our method leads to the discovery of EvoNorms, a set of new normalization-activation layers with novel and sometimes surprising structures that go beyond existing design patterns. For example, some EvoNorms do not assume that normalization and activation functions must be applied sequentially, nor need to center the feature maps, nor require explicit activation functions. Our experiments show that EvoNorms work well on image classification models, including ResNets, MobileNets and EfficientNets, and also transfer well to Mask R-CNN with FPN/SpineNet for instance segmentation and to BigGAN for image synthesis, outperforming BatchNorm- and GroupNorm-based layers in many cases.
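
To make the idea of a fused normalization-activation layer concrete, here is a minimal pure-Python sketch of a layer built only from the primitives the abstract names (multiplication, a sigmoid, and a second-order statistical moment). Its form is modeled on what the paper reports for one of the sample-based EvoNorms, as best recalled; the exact formula, the function name `evonorm_s0_like`, the parameter names `v`, `gamma`, `beta`, and the list-of-lists tensor layout are all assumptions made for illustration and are not taken from the abstract. Note that the numerator is not mean-centered and there is no separate activation applied after normalization.

```python
import math
from statistics import pvariance


def sigmoid(z: float) -> float:
    """Logistic function; adequate for the moderate values used in this sketch."""
    return 1.0 / (1.0 + math.exp(-z))


def evonorm_s0_like(x, groups, v, gamma, beta, eps=1e-5):
    """Hypothetical fused normalization-activation layer for ONE sample.

    x      : list of C channels, each a flat list of H*W floats (assumed layout)
    groups : number of channel groups sharing one variance estimate
    v, gamma, beta : per-channel parameter lists of length C (assumed names)
    """
    num_channels = len(x)
    if num_channels % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = num_channels // groups

    out = []
    for g in range(groups):
        channel_ids = range(g * per_group, (g + 1) * per_group)
        # Second moment about the mean, pooled over all positions and
        # channels in this group.
        group_values = [value for c in channel_ids for value in x[c]]
        std = math.sqrt(pvariance(group_values) + eps)
        for c in channel_ids:
            # The gate x * sigmoid(v * x) and the division by std happen in
            # one expression: normalization and activation are not applied
            # sequentially, and x itself is never centered.
            out.append([
                value * sigmoid(v[c] * value) / std * gamma[c] + beta[c]
                for value in x[c]
            ])
    return out


if __name__ == "__main__":
    sample = [[1.0, -1.0], [2.0, -2.0]]  # 2 channels, 2 spatial positions
    y = evonorm_s0_like(sample, groups=1, v=[1.0, 1.0],
                        gamma=[1.0, 1.0], beta=[0.0, 0.0])
    print(y)
```

A real implementation would operate on batched tensors in a framework such as JAX or TensorFlow; the nested-list version only shows which primitives are combined and in what order.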

Updated: 2020-07-20
