A Use of Even Activation Functions in Neural Networks
arXiv - CS - Artificial Intelligence. Pub Date: 2020-11-23. DOI: arxiv-2011.11713. Fuchang Gao, Boyu Zhang
Despite broad interest in applying deep learning techniques to scientific
discovery, learning interpretable formulas that accurately describe scientific
data is very challenging because of the vast landscape of possible functions
and the "black box" nature of deep neural networks. The key to success is to
effectively integrate existing knowledge or hypotheses about the underlying
structure of the data into the architecture of deep learning models to guide
machine learning. Currently, such integration is commonly done through
customization of the loss functions. Here we propose an alternative approach to
integrate existing knowledge or hypotheses of data structure by constructing
custom activation functions that reflect this structure. Specifically, we study
a common case in which the multivariate target function $f$ to be learned from the
data is partially exchangeable, \emph{i.e.}, $f(u,v,w)=f(v,u,w)$ for $u,v\in
\mathbb{R}^d$. For instance, this condition is satisfied by image
classification tasks that are invariant under left-right flipping. Through
theoretical proof and experimental verification, we show that using an even
activation function in one of the fully connected layers improves neural
network performance. In our experimental 9-dimensional regression problems,
replacing one of the non-symmetric activation functions with the designated
"Seagull" activation function $\log(1+x^2)$ results in substantial improvement
in network performance. Surprisingly, even activation functions are seldom used
in neural networks. Our results suggest that customized activation functions
have great potential in neural networks.
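The "Seagull" activation named in the abstract is simply $\log(1+x^2)$, which is an even function: it maps $x$ and $-x$ to the same value. A minimal sketch of the function and a numerical check of its evenness (this illustrative code is ours, not from the paper, and the function name `seagull` is our own label):

```python
import numpy as np

def seagull(x):
    """'Seagull' activation log(1 + x^2): smooth, unbounded, and even,
    i.e. seagull(-x) == seagull(x) for every x.
    log1p(x^2) is used for better numerical accuracy near x = 0."""
    return np.log1p(np.square(x))

# Numerically verify the evenness property on a symmetric grid.
x = np.linspace(-5.0, 5.0, 101)
assert np.allclose(seagull(x), seagull(-x))
assert seagull(0.0) == 0.0  # the function vanishes at the origin
```

Because the activation is even, a hidden unit applying it to a linear combination of inputs is insensitive to a sign flip of that combination, which is one way an architecture can encode the kind of exchange symmetry $f(u,v,w)=f(v,u,w)$ that the abstract describes.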
Updated: 2020-11-25