A Single-shot Generalized Device Placement for Large Dataflow Graphs
IEEE Micro (IF 2.8), Pub Date: 2020-09-01, DOI: 10.1109/mm.2020.3015188
Yanqi Zhou, Sudip Roy, Amirali Abdolrashidi, Daniel Lin-Kit Wong, Peter Ma, Qiumin Xu, Azalia Mirhoseini, James Laudon

With increasingly complex neural network architectures and heterogeneous device characteristics, finding a reasonable graph-partitioning and device-placement strategy is challenging. There have been prior attempts at learned approaches to device placement, but these approaches are computationally expensive, unable to handle large graphs consisting of over 50,000 nodes, and do not generalize well to unseen graphs. To address all of these limitations, we propose an efficient single-shot, generalized deep RL method (SGDP) based on a scalable sequential attention mechanism over a graph neural network that is transferable to new graphs. On a diverse set of representative deep learning models, our method achieves on average a 20% improvement over human placement and an 18% improvement over the prior art, with 15× faster convergence. We are the first to demonstrate superhuman performance on an 8-layer recurrent neural network language model and an 8-layer GNMT model consisting of over 50,000 nodes, on 8 GPUs. We provide rationales and a sensitivity study on model architecture selection.
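The abstract describes placing the nodes of a dataflow graph onto devices using a graph neural network; the paper's actual architecture (segment-level sequential attention, RL training loop) is not reproduced here. As a minimal illustrative sketch only, the following NumPy code shows the core idea of the encoder-plus-placement-head shape: one round of GCN-style message passing over a toy op graph, followed by a per-node softmax head that assigns each op to one of several devices. All names, sizes, and weights are hypothetical, untrained stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataflow graph: 6 ops, edges are (producer, consumer) pairs.
edges = [(0, 1), (1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
num_nodes, feat_dim, num_devices = 6, 8, 4

# Symmetric adjacency with self-loops, row-normalized (GCN-style).
A = np.eye(num_nodes)
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
A /= A.sum(axis=1, keepdims=True)

# Node features would encode op type, output shape, etc.; random here.
X = rng.standard_normal((num_nodes, feat_dim))

# One round of message passing: H = relu(A @ X @ W).
W = rng.standard_normal((feat_dim, feat_dim))
H = np.maximum(A @ X @ W, 0.0)

# Placement head: per-node logits over devices; argmax yields a placement.
# In the paper this head would be trained with RL against measured runtime.
W_out = rng.standard_normal((feat_dim, num_devices))
placement = np.argmax(H @ W_out, axis=1)
print(placement)  # one device id per op
```

With trained weights, the same forward pass would emit a placement for an unseen graph in a single shot, which is what makes the approach fast compared with per-graph search.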
