Resolving learning rates adaptively by locating stochastic non-negative associated gradient projection points using line searches
Journal of Global Optimization (IF 1.8), Pub Date: 2020-07-28, DOI: 10.1007/s10898-020-00921-z
Dominic Kafka, Daniel N. Wilke

Learning rates in stochastic neural network training are currently determined prior to training, using expensive manual or automated iterative tuning. Attempts to resolve learning rates adaptively, using line searches, have proven computationally demanding. Reducing the computational cost by considering mini-batch sub-sampling (MBSS) introduces challenges due to significant variance in information between batches that may present as discontinuities in the loss function, depending on the MBSS approach. This study proposes a robust approach to adaptively resolve learning rates in dynamic MBSS loss functions. This is achieved by finding sign changes from negative to positive along directional derivatives, which ultimately converge to a stochastic non-negative associated gradient projection point. Through a number of investigative studies, we demonstrate that gradient-only line searches (GOLS) resolve learning rates adaptively, improving convergence performance over minimization line searches, ignoring certain local minima and eliminating an otherwise expensive hyperparameter. We also show that poor search directions may benefit computationally from overstepping optima along a descent direction, which can be resolved by considering improved search directions. Having shown that GOLS is a reliable line search, comparative investigations between static and dynamic MBSS become possible.
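
The central idea of the abstract, choosing a step size by locating a negative-to-positive sign change of the directional derivative rather than by comparing loss values, can be illustrated with a minimal Python sketch. This is not the authors' GOLS implementation: the bracketing-plus-bisection strategy, the least-squares toy problem, and every name and default below (`gradient_only_line_search`, `minibatch_grad`, `train`, `alpha0`, `grow`, `n_bisect`) are assumptions made for illustration only.

```python
import numpy as np


def minibatch_grad(x, A, b, rng, batch_size):
    """Gradient of 0.5 * mean squared residual on a freshly sampled mini-batch."""
    idx = rng.choice(A.shape[0], size=batch_size, replace=False)
    residual = A[idx] @ x - b[idx]
    return A[idx].T @ residual / batch_size


def gradient_only_line_search(grad_fn, x, d, alpha0=1e-3, grow=2.0,
                              max_grow=60, n_bisect=30):
    """Return a step size near a negative-to-positive sign change of the
    directional derivative along d. Only gradients are used; loss values
    are never evaluated or compared."""

    def dd(alpha):
        # grad_fn may draw a new mini-batch on every call (dynamic MBSS),
        # so this value is a noisy estimate of the directional derivative.
        return float(grad_fn(x + alpha * d) @ d)

    lo, hi = 0.0, alpha0
    # Bracketing: grow the step until the directional derivative is >= 0.
    # If no sign change is seen within max_grow steps, the largest step
    # tried is used as the upper end of the bracket.
    for _ in range(max_grow):
        if dd(hi) >= 0.0:
            break
        lo, hi = hi, hi * grow
    # Bisection on the sign only: keep a negative side (lo) and a
    # non-negative side (hi) and shrink the interval between them.
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if dd(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi


def train(A, b, x0, n_steps=50, batch_size=32, seed=0):
    """Steepest-descent loop in which every step size comes from the
    gradient-only line search instead of a fixed learning rate."""
    rng = np.random.default_rng(seed)

    def grad_fn(x):
        return minibatch_grad(x, A, b, rng, batch_size)

    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        d = -grad_fn(x)  # search direction from one mini-batch
        if not np.any(d):
            break  # zero gradient: nothing to search along
        x = x + gradient_only_line_search(grad_fn, x, d) * d
    return x
```

On a deterministic one-dimensional quadratic with gradient `x`, starting at `x = 2` with `d = -2`, the directional derivative is `4 * alpha - 4`, so the sketch should return a step size very close to 1. With mini-batch gradients, each sign test sees a different batch, so the returned step is only a noisy estimate of where the sign change lies.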




Updated: 2020-07-28