Run-time adaptive data page mapping: A Comparison with 3D-stacked DRAM cache
Journal of Systems Architecture (IF 3.7), Pub Date: 2020-06-06, DOI: 10.1016/j.sysarc.2020.101798
Rakesh Pandey, Aryabartta Sahu

In the current chip-multiprocessor era, 3D-stacked DRAM has become an attractive way to mitigate the DRAM bandwidth wall problem. In a chip-multiprocessor, the 3D-stacked DRAM is architected either (a) to cache both local and remote data or (b) to cache only local data. Caching only local data in the 3D-stacked DRAM forces the chip-multiprocessor to incur inter-node latency overhead when accessing remote data. However, caching both local and remote data in the 3D-stacked DRAM requires a large coherence directory (tens of MBs) to ensure correctness.

In this paper, we consider a 3D-stacked DRAM-based chip-multiprocessor and perform a comparative study between (a) high-level adaptive run-time data page mapping onto DRAM with a small auxiliary SRAM buffer as a performance booster, and (b) DRAM used as a coherent cache. Our experiments on a 64-core chip-multiprocessor system with 4 GB of 3D-stacked DRAM show that our adaptive run-time data page mapping on DRAM, together with an SRAM buffer, outperforms the base case (where DRAM caches only local data) by an average of 48%. Moreover, our method improves performance by an average of 40% compared with a recent state-of-the-art work (where DRAM caches both local and remote data).




Updated: 2020-06-06