GrateTile: Efficient Sparse Tensor Tiling for CNN Processing
arXiv - CS - Hardware Architecture, Pub Date: 2020-09-18, DOI: arxiv-2009.08685
Yu-Sheng Lin, Hung Chang Lu, Yang-Bin Tsao, Yi-Min Chih, Wei-Chao Chen, Shao-Yi Chien

We propose GrateTile, an efficient, hardware-friendly data storage scheme for sparse CNN feature maps (activations). It divides data into uneven-sized sub-tensors and, with small indexing overhead, stores them in a compressed yet randomly accessible format. This design enables modern CNN accelerators to fetch and decompress sub-tensors on the fly in a tiled processing manner. GrateTile is suitable for architectures that favor aligned, coalesced data access, and requires only minimal changes to the overall architectural design. We simulate GrateTile with state-of-the-art CNNs and show an average 55% DRAM bandwidth reduction while using only 0.6% of the feature map size for index storage.
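To make the idea of uneven sub-tensors with a small index concrete, here is a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the paper's implementation: the alternating segment pattern `(6, 2)`, the bitmask-plus-nonzero-values encoding, and all function names (`split_points`, `compress`, `decompress`, `fetch`) are hypothetical choices made for this example, since the abstract does not specify them.

```python
from itertools import cycle


def split_points(length, pattern=(6, 2)):
    """Return boundaries of uneven segments covering [0, length).

    The (6, 2) pattern is an assumption for illustration only.
    """
    bounds = [0]
    for size in cycle(pattern):
        if bounds[-1] >= length:
            break
        bounds.append(min(bounds[-1] + size, length))
    return bounds


def compress(row, bounds):
    """Compress each segment independently and build an offset index.

    Each block is (bitmask, nonzero values). index[i] is the position of
    block i in a packed stream of nonzero values, which is what lets a
    reader jump straight to any block.
    """
    blocks, index, offset = [], [], 0
    for start, end in zip(bounds, bounds[1:]):
        seg = row[start:end]
        mask = [int(v != 0) for v in seg]
        values = [v for v in seg if v != 0]
        blocks.append((mask, values))
        index.append(offset)
        offset += len(values)
    return blocks, index


def decompress(block):
    """Expand one (bitmask, values) block back to a dense segment."""
    mask, values = block
    it = iter(values)
    return [next(it) if bit else 0 for bit in mask]


def fetch(blocks, bounds, lo, hi):
    """Return row[lo:hi], decompressing only the segments that overlap it."""
    out = []
    for i, (start, end) in enumerate(zip(bounds, bounds[1:])):
        if end <= lo or start >= hi:
            continue  # segment lies outside the requested window
        seg = decompress(blocks[i])
        out.extend(seg[max(lo, start) - start:min(hi, end) - start])
    return out


# Example: a sparse row of 16 activations.
row = [0, 3, 0, 0, 5, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 7]
bounds = split_points(len(row))
blocks, index = compress(row, bounds)
window = fetch(blocks, bounds, 5, 10)
```

Because every segment is compressed on its own and the index records where each one starts, a tile window can be read without touching unrelated segments. A real accelerator would additionally align segments to memory bursts, which this sketch does not model.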

Updated: 2020-09-21