HAM: Hotspot-Aware Manager for Improving Communications With 3D-Stacked Memory
IEEE Transactions on Computers (IF 3.6), Pub Date: 2021-03-18, DOI: 10.1109/tc.2021.3066982
Xi Wang, Antonino Tumeo, John D. Leidel, Jie Li, Yong Chen

Emerging High-Performance Computing (HPC) workloads, such as graph analytics, machine learning, and big data science, are data-intensive. Data-intensive workloads usually issue fine-grained memory accesses with limited or no data locality, and thus incur frequent cache misses and low utilization of memory bandwidth. 3D-stacked memory devices such as Hybrid Memory Cube (HMC) and High Bandwidth Memory (HBM) can provide significantly higher bandwidth than conventional memory modules. However, the traditional interfaces and optimization methods designed for JEDEC DDR devices cannot fully exploit the potential performance of 3D-stacked memory when faced with the massive number of irregular memory accesses that data-intensive applications generate. In this article, we propose a novel Hotspot-Aware Manager (HAM) infrastructure for 3D-stacked memory devices that optimizes memory access streams via request aggregation, hotspot detection, and in-memory prefetching. We present the HAM design and implementation, and simulate it on a system using RISC-V embedded cores with attached HMC devices. We extensively evaluate HAM with over 12 benchmarks and applications representing diverse irregular memory access patterns. The results show that, on average, HAM reduces redundant requests by 37.51 percent and increases the prefetch buffer hit rate by 4.2 times, compared to a baseline streaming prefetcher. On the selected benchmark set, HAM provides performance gains of 21.81 percent on average (up to 34.28 percent), and power savings of 35.07 percent over a standard 3D-stacked memory.
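
The abstract names request aggregation and hotspot detection without giving details, so the following is only a toy software sketch of the general idea, not the HAM design. It assumes a 64-byte block size and a simple count threshold for "hot" blocks; the names `aggregate`, `hotspots`, `redundant_fraction`, and `BLOCK_BYTES` are hypothetical and do not come from the paper.

```python
from collections import OrderedDict

# Assumed block granularity for illustration; the abstract does not state one.
BLOCK_BYTES = 64


def aggregate(addresses, block_bytes=BLOCK_BYTES):
    """Coalesce byte addresses into per-block request counts.

    Returns an OrderedDict mapping block index -> number of raw requests
    that fell into that block, in first-seen order.
    """
    blocks = OrderedDict()
    for addr in addresses:
        block = addr // block_bytes
        blocks[block] = blocks.get(block, 0) + 1
    return blocks


def hotspots(blocks, threshold):
    """Return block indices whose request count reaches the threshold."""
    return [block for block, count in blocks.items() if count >= threshold]


def redundant_fraction(addresses, blocks):
    """Fraction of raw requests that coalescing would make unnecessary."""
    if not addresses:
        return 0.0
    return 1 - len(blocks) / len(addresses)


if __name__ == "__main__":
    # A small irregular access stream (byte addresses), made up for the demo.
    stream = [0, 8, 16, 64, 72, 4096, 0, 24]
    blocks = aggregate(stream)
    print(dict(blocks))
    print(hotspots(blocks, threshold=3))
    print(redundant_fraction(stream, blocks))
```

In this toy stream, eight fine-grained requests touch only three distinct blocks, which is the kind of redundancy that aggregation targets; a hardware manager would apply the same idea to in-flight requests within a bounded window rather than to a whole trace.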

Updated: 2021-05-25