GPU-NEST: Characterizing Energy Efficiency of Multi-GPU Inference Servers
IEEE Computer Architecture Letters (IF 1.4), Pub Date: 2020-09-14, DOI: 10.1109/lca.2020.3023723
Ali Jahanshahi , Hadi Zamani Sabzi , Chester Lau , Daniel Wong

Cloud inference systems have recently emerged as a solution to the ever-increasing integration of AI-powered applications into the smart devices around us. The wide adoption of GPUs in cloud inference systems has made power consumption a first-order constraint in multi-GPU systems. Improving energy efficiency therefore requires better insight into the power and performance behaviors of multi-GPU inference systems. To this end, we propose GPU-NEST, an energy efficiency characterization methodology for multi-GPU inference systems. As case studies, we examine the challenges presented by, and implications of, multi-GPU scaling, inference scheduling, and non-GPU bottlenecks for the energy efficiency of multi-GPU inference systems. We find that inference scheduling in particular offers great benefit, improving the energy efficiency of multi-GPU systems by as much as 40 percent.
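For concreteness, the central quantity in such a characterization, inferences per joule across all GPUs, can be sketched in a few lines. The snippet below is a minimal illustration, not the GPU-NEST implementation: the function name and the user-supplied run_inference_batch callable are assumptions for this sketch. It uses NVIDIA's NVML bindings (pynvml) to sample aggregate GPU power, which NVML reports in milliwatts, and integrates it into energy over the serving run.

```python
# Minimal sketch (not GPU-NEST): measure throughput, average power, and
# inferences per joule for a multi-GPU inference workload via NVML.
import time
from pynvml import (nvmlInit, nvmlShutdown, nvmlDeviceGetCount,
                    nvmlDeviceGetHandleByIndex, nvmlDeviceGetPowerUsage)

def measure_energy_efficiency(run_inference_batch, num_batches, batch_size):
    """run_inference_batch is a hypothetical user-supplied callable that
    serves one batch of inferences on the GPUs."""
    nvmlInit()
    handles = [nvmlDeviceGetHandleByIndex(i)
               for i in range(nvmlDeviceGetCount())]
    energy_j = 0.0
    completed = 0
    start = last = time.time()
    for _ in range(num_batches):
        run_inference_batch()
        completed += batch_size
        now = time.time()
        # NVML reports instantaneous power in milliwatts per device;
        # sum across GPUs and integrate over the batch interval.
        power_w = sum(nvmlDeviceGetPowerUsage(h) for h in handles) / 1000.0
        energy_j += power_w * (now - last)
        last = now
    elapsed = time.time() - start
    nvmlShutdown()
    return {
        "throughput_inf_per_s": completed / elapsed,
        "avg_power_w": energy_j / elapsed,
        "efficiency_inf_per_j": completed / energy_j,
    }
```

Sampling power once per batch keeps overhead low but undersamples short batches; a fuller harness would poll NVML on a separate thread at a fixed period and would also account for non-GPU (CPU, host memory) power, which the paper identifies as a potential bottleneck.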
