Fast Modeling L2 Cache Reuse Distance Histograms Using Combined Locality Information from Software Traces
arXiv - CS - Hardware Architecture, Pub Date: 2019-07-11, DOI: arxiv-1907.05068
Ming Ling, Jiancong Ge, Guangmin Wang

To mitigate the performance gap between the CPU and main memory, multi-level cache architectures are widely used in modern processors. Modeling the behavior of the downstream caches is therefore a critical part of processor performance evaluation in the early stages of Design Space Exploration (DSE). In this paper, we propose a fast and accurate L2 cache reuse distance histogram model, which can be used to predict the behavior of multi-level cache architectures in which the L1 cache uses the LRU replacement policy and the L2 cache uses the LRU or Random replacement policy. As inputs, we use the profiled L1 reuse distance histogram and two newly proposed metrics, the RST table and the Hit-RDH, which describe the software traces in more detail. For a given L1 cache configuration, the profiling results can be reused for different configurations of the L2 cache. The output of our model is the L2 cache reuse distance histogram, from which the L2 cache miss rates can be evaluated. We compare the predicted L2 cache miss rates with the results of gem5 cycle-accurate simulations of 15 benchmarks chosen from SPEC CPU 2006 and 9 benchmarks from SPEC CPU 2017. The average absolute error is less than 5%, while the evaluation of each L2 configuration can be sped up by almost 30X for four L2 cache candidates.
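To make the link between a reuse distance histogram and a miss rate concrete, here is a minimal sketch of the textbook case only: a single fully associative LRU cache, where an access hits exactly when fewer distinct lines than the cache capacity were touched since the previous access to the same line. This is not the model proposed in the paper (the abstract does not define the RST table or the Hit-RDH, so they are not reproduced here), and the function names `reuse_distance_histogram` and `lru_miss_rate` are hypothetical.

```python
import math
from collections import Counter


def reuse_distance_histogram(trace):
    """Histogram of reuse distances for a sequence of cache-line addresses.

    The reuse distance of an access is the number of distinct lines touched
    since the previous access to the same line; first-time (cold) accesses
    are recorded under math.inf. This O(N*M) list-based version is meant for
    illustration, not for large traces.
    """
    stack = []  # LRU stack, most recently used line at the end
    hist = Counter()
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            distance = len(stack) - 1 - idx
            stack.pop(idx)
        else:
            distance = math.inf
        hist[distance] += 1
        stack.append(addr)
    return hist


def lru_miss_rate(hist, capacity_lines):
    """Miss rate of a fully associative LRU cache holding `capacity_lines` lines.

    An access misses when its reuse distance is >= the capacity
    (cold accesses, stored under math.inf, always miss).
    """
    total = sum(hist.values())
    misses = sum(count for dist, count in hist.items() if dist >= capacity_lines)
    return misses / total if total else 0.0


# Toy trace of line addresses (assumed for illustration).
trace = ["A", "B", "C", "A", "B", "D", "A"]
hist = reuse_distance_histogram(trace)
print(lru_miss_rate(hist, 3))  # 4 misses out of 7 accesses
```

The same histogram can be queried for several capacities without re-walking the trace, which is the general reason histogram-based models are attractive when several cache candidates are compared.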

Updated: 2020-10-13