A Dual Process Model for Optimizing Cross Entropy in Neural Networks
arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2021-04-27, DOI: arxiv-2104.13277
Stefan Jaeger

Minimizing cross-entropy is a widely used method for training artificial neural networks, and many backpropagation-based training procedures use cross-entropy directly as their loss function. This theoretical essay instead investigates a dual-process model in which one process minimizes the Kullback-Leibler divergence while its dual counterpart minimizes the Shannon entropy. Postulating that learning consists of two such complementary dual processes, the model defines an equilibrium state for both processes in which the loss function attains its minimum. An advantage of the proposed model is that it yields the optimal learning rate and momentum weight for the backpropagation weight updates. Furthermore, the model introduces the golden ratio and complex numbers as important new concepts in machine learning.
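As background for the duality the abstract describes, the short Python sketch below (not taken from the paper) numerically verifies the standard identity H(p, q) = H(p) + D_KL(p || q): cross-entropy splits into a Shannon-entropy term and a Kullback-Leibler term, the two quantities assigned to the model's dual processes. The distributions p and q are arbitrary toy values chosen for illustration.

# Minimal sketch (illustration only, not code from the paper):
# verify that cross-entropy H(p, q) equals H(p) + D_KL(p || q)
# for two toy probability distributions p and q.
import numpy as np

def shannon_entropy(p):
    """H(p) = -sum_i p_i * log(p_i)."""
    return -np.sum(p * np.log(p))

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    return np.sum(p * np.log(p / q))

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i)."""
    return -np.sum(p * np.log(q))

p = np.array([0.7, 0.2, 0.1])   # target distribution (toy values)
q = np.array([0.5, 0.3, 0.2])   # model distribution (toy values)

lhs = cross_entropy(p, q)
rhs = shannon_entropy(p) + kl_divergence(p, q)
print(f"H(p,q)            = {lhs:.6f}")
print(f"H(p) + D_KL(p||q) = {rhs:.6f}")   # equal up to floating-point error

This decomposition also explains why minimizing cross-entropy with respect to the model distribution q is equivalent to minimizing D_KL(p || q) alone: the Shannon-entropy term H(p) does not depend on q.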

Updated: 2021-04-29