PID Controller-Based Stochastic Optimization Acceleration for Deep Neural Networks.
IEEE Transactions on Neural Networks and Learning Systems (IF 10.4). Pub Date: 2020-11-30. DOI: 10.1109/tnnls.2019.2963066
Haoqian Wang, Yi Luo, Wangpeng An, Qingyun Sun, Jun Xu, Lei Zhang

Deep neural networks (DNNs) are widely used and have demonstrated their power in many applications, such as computer vision and pattern recognition. However, training these networks can be time-consuming. This problem can be alleviated by using efficient optimizers. As one of the most commonly used optimizers, stochastic gradient descent with momentum (SGD-M) uses past and present gradients for parameter updates. However, during network training, SGD-M may suffer from drawbacks such as the overshoot phenomenon, which slows training convergence. To alleviate this problem and accelerate the convergence of DNN optimization, we propose a proportional-integral-derivative (PID) approach. Specifically, we first investigate the intrinsic relationship between PID-based controllers and SGD-M. We then propose a PID-based optimization algorithm to update the network parameters, which exploits the past gradients, the current gradient, and the change in gradients. Consequently, the proposed PID-based optimization alleviates the overshoot problem suffered by SGD-M. When tested on popular DNN architectures, it also achieves up to 50% acceleration with competitive accuracy. Extensive experiments on computer vision and natural language processing benchmarks, including CIFAR10, CIFAR100, Tiny-ImageNet, and PTB, demonstrate the effectiveness of our method. We have released the code at https://github.com/tensorboy/PIDOptimizer.
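The abstract's PID analogy maps onto SGD-M as follows: the current gradient plays the role of the proportional (P) term, the accumulated momentum buffer plays the role of the integral (I) term, and the paper's addition is a derivative (D) term built from the change in gradients between iterations. The released code targets PyTorch; the sketch below is a minimal, hypothetical illustration of that structure as a torch.optim.Optimizer subclass, not the authors' released implementation. The class name PIDSketch, the gain kd, and the exponential smoothing of the D term are illustrative assumptions.

import torch
from torch.optim import Optimizer

class PIDSketch(Optimizer):
    # Hypothetical PID-style optimizer sketch; see the official repo
    # (https://github.com/tensorboy/PIDOptimizer) for the authors' code.
    def __init__(self, params, lr=0.01, momentum=0.9, kd=1.0):
        defaults = dict(lr=lr, momentum=momentum, kd=kd)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            lr, mu, kd = group['lr'], group['momentum'], group['kd']
            for p in group['params']:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if len(state) == 0:
                    state['I'] = torch.zeros_like(p)   # integral buffer (momentum)
                    state['D'] = torch.zeros_like(p)   # derivative buffer
                    state['prev_grad'] = g.clone()     # gradient from the previous step
                I, D = state['I'], state['D']
                # I term: accumulated past and present gradients, as in SGD-M.
                I.mul_(mu).add_(g)
                # D term: smoothed change of gradients between iterations.
                D.mul_(mu).add_(g - state['prev_grad'], alpha=1 - mu)
                state['prev_grad'] = g.clone()
                # Combined PID-style parameter update.
                p.add_(I + kd * D, alpha=-lr)
        return loss

Such a sketch would be used like any other PyTorch optimizer, e.g. opt = PIDSketch(model.parameters(), lr=0.01, momentum=0.9, kd=1.0) inside the usual opt.zero_grad(), loss.backward(), opt.step() loop; the kd gain would need tuning per task.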

Updated: 2020-01-28