Opportunistic Caching in NoC: Exploring Ways to Reduce Miss Penalty
IEEE Transactions on Computers (IF 3.7), Pub Date: 2021-03-31, DOI: 10.1109/tc.2021.3069968
Abhijit Das, Abhishek Kumar, John Jose, Maurizio Palesi

Due to limited on-chip caching, data-driven applications with large memory footprints encounter frequent cache misses. Such applications suffer a recurring miss penalty when they re-reference recently evicted cache blocks. To meet worst-case performance requirements, Network-on-Chip (NoC) routers are provisioned with input port buffers. However, recent studies reveal that these buffers remain underutilised except during network congestion. Trace buffers are Design-for-Debug (DfD) hardware employed in NoC routers for post-silicon debug and validation. However, they serve no function once a design goes into production and remain unused in the routers. In this article, we exploit the underutilised NoC router buffers and the unused trace buffers to store recently evicted cache blocks. While these blocks are held in the buffers, future re-references to them can be served from the NoC router. Such opportunistic caching of evicted blocks in NoC routers significantly reduces the miss penalty. Experimental analysis shows that the proposed architectures can achieve up to a 21 percent (16 percent on average) reduction in miss penalty and a 19 percent (14 percent on average) improvement in overall system performance. The area and leakage power overheads are small, at 2.58 and 3.94 percent, respectively, while dynamic power falls by 6.12 percent due to the improvement in performance.
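
The idea described in the abstract can be illustrated with a toy model. The sketch below is not the authors' architecture: it assumes a single LRU cache, a small FIFO store standing in for spare router and trace-buffer slots, and made-up latencies (`router_latency`, `memory_latency`). All class, function, and parameter names are hypothetical. It only shows how serving a re-referenced evicted block from a nearby store lowers the accumulated miss penalty.

```python
from collections import OrderedDict


class RouterVictimStore:
    """Toy stand-in for spare router / trace-buffer slots (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block address -> True, oldest first

    def store_evicted(self, addr):
        if self.capacity == 0:
            return  # no spare buffer space: the evicted block is simply lost
        self.blocks.pop(addr, None)
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # drop the oldest stored block
        self.blocks[addr] = True

    def take(self, addr):
        # A hit removes the block, since it moves back into the cache.
        return self.blocks.pop(addr, None) is not None


def total_miss_penalty(trace, cache_size, store_capacity,
                       router_latency=10, memory_latency=100):
    """Sum the miss penalty (in assumed cycles) over an address trace."""
    cache = OrderedDict()  # LRU order: least recently used first
    store = RouterVictimStore(store_capacity)
    penalty = 0
    for addr in trace:
        if addr in cache:
            cache.move_to_end(addr)  # cache hit: refresh LRU position
            continue
        # Cache miss: try the router store before going to memory.
        penalty += router_latency if store.take(addr) else memory_latency
        cache[addr] = True
        if len(cache) > cache_size:
            victim, _ = cache.popitem(last=False)  # evict the LRU block
            store.store_evicted(victim)            # keep it in the router
    return penalty


if __name__ == "__main__":
    trace = [0, 1, 2, 0, 1, 2]  # working set larger than the cache
    print(total_miss_penalty(trace, cache_size=2, store_capacity=0))
    print(total_miss_penalty(trace, cache_size=2, store_capacity=2))
```

In this toy trace every access misses the two-entry cache, so the only difference between the two runs is whether the last three misses are served from the store or from memory. The latencies are placeholders and say nothing about the figures reported in the paper.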

Updated: 2021-05-25