Training Deep Nets with Progressive Batch Normalization on Multi-GPUs
International Journal of Parallel Programming (IF 0.9), Pub Date: 2018-12-17, DOI: 10.1007/s10766-018-0615-5
Lianke Qin, Yifan Gong, Tianqi Tang, Yutian Wang, Jiangming Jin

Batch normalization (BN) enables us to train various deep neural networks faster. However, training accuracy degrades significantly as the input mini-batch size decreases. To improve model accuracy, a global mean and variance computed over all input mini-batches can be used, but this requires communication across all devices in every BN layer, which greatly reduces training speed. To address this problem, we propose progressive batch normalization, which achieves a good balance between model accuracy and efficiency in multi-GPU training. Experimental results show that our algorithm obtains a significant performance improvement over traditional BN without cross-GPU data synchronization, achieving up to an 18.4% improvement when training DeepLab for a semantic segmentation task across 8 GPUs.
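To make the trade-off concrete, the sketch below contrasts per-GPU (local) BN statistics with globally synchronized statistics. It is a minimal NumPy illustration of the baseline behaviour the abstract describes, not the paper's progressive BN algorithm; the number of simulated devices, the array shapes, and all variable names are assumptions chosen for demonstration.

import numpy as np

# Illustrative only: 4 simulated "GPUs", each with a local mini-batch
# of 8 samples and 16 channels (assumed shapes, not from the paper).
rng = np.random.default_rng(0)
local_batches = [rng.normal(loc=g, scale=1.0, size=(8, 16)) for g in range(4)]

# Local BN: each device normalizes with its own per-channel mean/variance,
# so no communication is needed, but the statistics differ across devices.
local_stats = [(x.mean(axis=0), x.var(axis=0)) for x in local_batches]

# Synchronized BN: devices exchange per-channel sums and squared sums
# (one all-reduce per BN layer), then every device uses the same
# global mean and variance.
n = sum(x.shape[0] for x in local_batches)
global_sum = sum(x.sum(axis=0) for x in local_batches)
global_sqsum = sum((x ** 2).sum(axis=0) for x in local_batches)
global_mean = global_sum / n
global_var = global_sqsum / n - global_mean ** 2

print("per-GPU means of channel 0:", [m[0].round(2) for m, _ in local_stats])
print("global mean of channel 0:  ", global_mean[0].round(2))

In practice, mainstream frameworks expose both variants; for example, PyTorch's torch.nn.BatchNorm2d normalizes with per-device statistics, while torch.nn.SyncBatchNorm performs the cross-device reduction sketched above at the cost of extra communication in every BN layer.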

Updated: 2018-12-17