The Deep Learning Compiler: A Comprehensive Survey
IEEE Transactions on Parallel and Distributed Systems (IF 5.3), Pub Date: 2021-03-01, DOI: 10.1109/tpds.2020.3030548
Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian

The difficulty of deploying various deep learning (DL) models on diverse DL hardware has boosted the research and development of DL compilers in the community. Several DL compilers have been proposed by both industry and academia, such as TensorFlow XLA and TVM. Generally, the DL compilers take the DL models described in different DL frameworks as input, and then generate optimized code for diverse DL hardware as output. However, none of the existing surveys has analyzed the unique design architecture of the DL compilers comprehensively. In this article, we perform a comprehensive survey of existing DL compilers by dissecting the commonly adopted design in detail, with an emphasis on the DL-oriented multi-level IRs and the frontend/backend optimizations. We present a detailed analysis of the design of multi-level IRs and illustrate the commonly adopted optimization techniques. Finally, several insights are highlighted as potential research directions for DL compilers. This is the first survey article focusing on the design architecture of DL compilers, which we hope can pave the road for future research towards DL compilers.
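To make the flow the abstract describes more concrete, the sketch below (not taken from the survey) shows how a DL compiler such as TVM accepts a model expressed in its high-level IR as input and generates optimized code for a target backend. It assumes a Relay-based TVM installation roughly contemporaneous with the survey; exact API names (e.g., graph_executor vs. graph_runtime) may differ across TVM versions.

```python
# Minimal sketch: framework-level model in, optimized hardware code out, via TVM.
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# Frontend: build a tiny "model" (a single dense layer) directly in Relay,
# TVM's graph-level IR. Real workflows would import a TensorFlow/PyTorch/ONNX
# model through relay.frontend.* converters instead.
data = relay.var("data", shape=(1, 64), dtype="float32")
weight = relay.var("weight", shape=(16, 64), dtype="float32")
func = relay.Function([data, weight], relay.nn.dense(data, weight))
mod = tvm.IRModule.from_expr(func)

# Backend: apply graph- and operator-level optimizations, then generate code
# for a concrete target (here the local CPU via LLVM).
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm")

# Run the compiled module to verify the end-to-end flow.
dev = tvm.cpu(0)
rt = graph_executor.GraphModule(lib["default"](dev))
rt.set_input("data", np.random.rand(1, 64).astype("float32"))
rt.set_input("weight", np.random.rand(16, 64).astype("float32"))
rt.run()
print(rt.get_output(0).numpy().shape)  # expected: (1, 16)
```

The same input/output contract holds for the other compilers the survey covers (e.g., TensorFlow XLA), even though their IRs and APIs differ.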
