CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework
arXiv - CS - Artificial Intelligence Pub Date : 2022-06-21 , DOI: arxiv-2206.10620
Xiaofeng Li, Bin Ren, Xipeng Shen, Yanzhi Wang

There is a growing demand for shifting the delivery of AI capability from data centers on the cloud to edge or end devices, exemplified by the fast-emerging real-time AI-based apps running on smartphones, AR/VR devices, autonomous vehicles, and various IoT devices. The shift has, however, been seriously hampered by the large and growing gap between DNN computing demands and the computing power of edge or end devices. This article presents the design of XGen, an optimizing framework for DNNs designed to bridge this gap. XGen takes cross-cutting co-design as its first-order consideration. Its full-stack AI-oriented optimizations consist of a number of innovative optimizations at every layer of the DNN software stack, all designed in a cooperative manner. This unique technology enables XGen to optimize various DNNs, including those with extreme depth (e.g., BERT, GPT, other transformers), and to generate code that runs several times faster than code from existing DNN frameworks while delivering the same level of accuracy.

Updated: 2022-06-23