A Use of Even Activation Functions in Neural Networks
arXiv - CS - Artificial Intelligence. Pub Date: 2020-11-23. DOI: arxiv-2011.11713. Fuchang Gao, Boyu Zhang
Despite broad interest in applying deep learning techniques to scientific
discovery, learning interpretable formulas that accurately describe scientific
data is very challenging because of the vast landscape of possible functions
and the "black box" nature of deep neural networks. The key to success is to
effectively integrate existing knowledge or hypotheses about the underlying
structure of the data into the architecture of deep learning models to guide
machine learning. Currently, such integration is commonly done through
customization of the loss functions. Here we propose an alternative approach to
integrate existing knowledge or hypotheses of data structure by constructing
custom activation functions that reflect this structure. Specifically, we study
a common case in which the multivariate target function $f$ to be learned from the
data is partially exchangeable, \emph{i.e.}, $f(u,v,w)=f(v,u,w)$ for $u,v\in
\mathbb{R}^d$. For instance, this condition is satisfied by image
classification tasks that are invariant under left-right flipping. Through
theoretical proof and experimental verification, we show that using an even
activation function in one of the fully connected layers improves neural
network performance. In our experimental 9-dimensional regression problems,
replacing one of the non-symmetric activation functions with the designated
"Seagull" activation function $\log(1+x^2)$ results in substantial improvement
in network performance. Surprisingly, even activation functions are seldom used
in neural networks. Our results suggest that customized activation functions
have great potential in neural networks.
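The "Seagull" activation named in the abstract is simply $\log(1+x^2)$, which is an even function: it maps $x$ and $-x$ to the same value. A minimal sketch of the function and a numerical check of its evenness (this illustrative code is ours, not from the paper, and the function name `seagull` is our own label):

```python
import numpy as np

def seagull(x):
    """'Seagull' activation log(1 + x^2): smooth, unbounded, and even,
    i.e. seagull(-x) == seagull(x) for every x.
    log1p(x^2) is used for better numerical accuracy near x = 0."""
    return np.log1p(np.square(x))

# Numerically verify the evenness property on a symmetric grid.
x = np.linspace(-5.0, 5.0, 101)
assert np.allclose(seagull(x), seagull(-x))
assert seagull(0.0) == 0.0  # the function vanishes at the origin
```

Because the activation is even, a hidden unit applying it to a linear combination of inputs is insensitive to a sign flip of that combination, which is one way an architecture can encode the kind of exchange symmetry $f(u,v,w)=f(v,u,w)$ that the abstract describes.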
Updated: 2020-11-25