A Generic Shift-Norm-Activation Approach For Deep Learning
Pattern Recognition (IF 7.5) Pub Date: 2021-01-01, DOI: 10.1016/j.patcog.2020.107609
Zhi Chen, Pin-Han Ho

Abstract Deep learning has received increasing attention over the last decade. Its remarkable success is partly attributed to the evolution of normalization and activation techniques. However, few works have been devoted to exploring the two modules together. This work therefore aims at a deeper analytical understanding of the joint effect of normalization and activation. We design a generic method that integrates normalization and activation into a single module, named the Generic Shift-Normalization-Activation Approach (GSNA), to preserve richer information propagation in neural networks. A rigorous mathematical analysis is performed to investigate the benefits of the designed method, including its computational complexity, its performance potential, and the optimization of trainable-parameter initialization. Furthermore, extensive experiments demonstrate the superiority and generality of the designed method on many computer-vision benchmarks, such as CIFAR-10/100, SVHN, and ImageNet32×32. To explore its generality, we also conduct experiments on natural-language-understanding tasks such as text classification and natural language inference, as well as on a variational generative task. More interestingly, GSNA can be naturally incorporated into existing neural networks of arbitrary architecture, demonstrating its generic effectiveness across the deep learning field.
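The abstract does not spell out the GSNA formulation, but the name suggests a normalization step, a shift, and an activation fused into one block. The PyTorch sketch below shows one plausible wiring under that assumption; the ShiftNormActivation class, its learnable per-channel shift, and the choice of BatchNorm plus ReLU are all illustrative placeholders, not the paper's actual design.

```python
import torch
import torch.nn as nn

class ShiftNormActivation(nn.Module):
    """Hypothetical shift-norm-activation block: normalization, a
    learnable per-channel shift, then a nonlinearity, packaged as one
    drop-in module. Illustrative only; the paper's actual GSNA
    formulation is not specified in this abstract."""

    def __init__(self, num_channels: int):
        super().__init__()
        self.norm = nn.BatchNorm2d(num_channels)
        # Assumed design choice: a trainable shift inserted between
        # normalization and activation, broadcast over batch and space.
        self.shift = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(x) + self.shift)

# Usage: replaces a norm + activation pair anywhere in a CNN,
# which is consistent with the claimed architecture-agnostic drop-in use.
x = torch.randn(8, 64, 32, 32)
block = ShiftNormActivation(64)
y = block(x)
print(y.shape)  # torch.Size([8, 64, 32, 32])
```

Initializing the shift to zero makes the block behave exactly like a plain norm-then-activation pair at the start of training, so swapping it into an existing network cannot degrade the initial forward pass; this mirrors the abstract's concern with trainable-parameter initialization, though the paper's actual initialization scheme is not given here.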

Updated: 2021-01-01