Design of a Sparsity-Aware Reconfigurable Deep Learning Accelerator Supporting Various Types of Operations
IEEE Journal on Emerging and Selected Topics in Circuits and Systems ( IF 3.7 ) Pub Date : 2020-09-01 , DOI: 10.1109/jetcas.2020.3015238
Shen-Fu Hsiao , Kun-Chih Chen , Chih-Chien Lin , Hsuan-Jui Chang , Bo-Ching Tsai

The superiority of various Deep Neural Network (DNN) models, such as Convolutional Neural Networks (CNN), Generative Adversarial Networks (GAN), and Recurrent Neural Networks (RNN), has been proven in many real-world applications, and these models have received much attention. However, different DNN models involve different types of operations. For example, CNN models are usually composed of convolutional layers (Conv) and fully-connected layers (FC). Furthermore, some lightweight CNN models such as MobileNet adopt depthwise separable convolution, consisting of depthwise convolution (DWC) and pointwise convolution (PWC), to compress the models. In addition to regular convolution, de-convolution (DeConv) is also widely used in many GAN models. Moreover, many RNN models employ long short-term memory (LSTM) cells to control the update of internal states and data. Such a high diversity of DNN operations poses great design challenges for reconfigurable Deep Learning (DL) accelerators that aim to support all of these operation types. Most recent DL accelerators support only a subset of DNN operations and thus lack computing flexibility. In this paper, by exploiting the sparsity in current DNN models, we design sparsity-aware DL hardware accelerators that support efficient computation of various DNN operations, including Conv, DeConv, DWC, PWC, FC, and LSTM. By reconfiguring the dataflow and parallelizing different operations, the proposed designs not only improve system performance but also increase hardware utilization while significantly reducing the power consumed by memory accesses and arithmetic computations.
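Two of the ideas the abstract leans on can be made concrete with a short sketch (illustrative only, not code from the paper): depthwise separable convolution replaces one K×K convolution over all channels with a per-channel K×K depthwise step plus a 1×1 pointwise step, shrinking the weight count by roughly a factor of 1/C_out + 1/K²; and a sparsity-aware datapath skips multiply-accumulates (MACs) whenever either operand is zero. The function names below are hypothetical helpers chosen for illustration.

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weight count of a standard KxK convolution layer (no bias)."""
    return k * k * c_in * c_out

def dsc_params(k: int, c_in: int, c_out: int) -> int:
    """Weight count of depthwise (DWC) + pointwise (PWC) convolution."""
    return k * k * c_in + c_in * c_out

def sparse_dot(weights, acts):
    """Dot product that skips MACs where either operand is zero --
    the arithmetic saving a sparsity-aware accelerator exploits."""
    acc, macs = 0, 0
    for w, a in zip(weights, acts):
        if w != 0 and a != 0:
            acc += w * a
            macs += 1
    return acc, macs

# MobileNet-style layer: 3x3 kernel, 64 input channels, 128 output channels.
k, c_in, c_out = 3, 64, 128
std = conv_params(k, c_in, c_out)   # 3*3*64*128 = 73728 weights
dsc = dsc_params(k, c_in, c_out)    # 3*3*64 + 64*128 = 8768 weights
print(std, dsc, round(std / dsc, 1))  # roughly 8.4x fewer weights

# With sparse operands, only 1 of 4 MACs is actually performed.
acc, macs = sparse_dot([0, 2, 0, 3], [5, 0, 7, 1])
print(acc, macs)
```

In a hardware accelerator the zero-skipping happens in the datapath and scheduler rather than in a Python loop, but the arithmetic it avoids is the same, which is why sparsity directly reduces both MAC count and memory traffic.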
