Ring-mesh: a scalable and high-performance approach for manycore accelerators
The Journal of Supercomputing ( IF 3.3 ) Pub Date : 2019-12-12 , DOI: 10.1007/s11227-019-03072-5
Somnath Mazumdar , Alberto Scionti

There is an increasing number of works addressing the design challenges of fast, scalable solutions for the growing number of new types of applications. Recently, many solutions have aimed at improving processing element capabilities to speed up the execution of machine learning applications. However, only a few works have focused on the interconnection subsystem as a potential source of performance improvement. Packing many cores together offers excellent parallelism, but it brings other challenges (e.g. adequate interconnections). Scalable, power-aware interconnects are required to support such a growing number of processing elements, as well as modern applications. In this paper, we propose a scalable and energy-efficient network-on-chip architecture that fuses the advantages of rings and the 2D mesh, without using any bridge router, to provide high performance. A dynamic adaptation mechanism allows the network to better adapt to application requirements. Simulation results show efficient power consumption (up to 141.3% saving for connecting 1024 cores) and 2× (on average) throughput growth with better scalability (up to 1024 processing elements) compared to the popular 2D mesh when tested under multiple statistical traffic pattern scenarios.
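
The abstract does not describe the Ring-mesh topology itself, so the following is not the authors' design. It is only a minimal sketch of why the choice of interconnect matters as the core count grows: it computes the average hop count between distinct nodes for a plain k × k 2D mesh and for a plain bidirectional ring with the same number of nodes. The function names `mesh_avg_hops` and `ring_avg_hops` are hypothetical, and shortest-path routing with uniform random traffic is an assumption made purely for illustration.

```python
from itertools import product


def mesh_avg_hops(k: int) -> float:
    """Average Manhattan distance between distinct nodes of a k x k 2D mesh.

    Assumes minimal (shortest-path) routing and uniform traffic;
    this is an illustrative model, not the paper's simulator.
    """
    nodes = list(product(range(k), repeat=2))
    total = 0
    # Sum hop counts over all ordered (source, destination) pairs.
    for (x1, y1), (x2, y2) in product(nodes, repeat=2):
        total += abs(x1 - x2) + abs(y1 - y2)
    n = len(nodes)
    # Exclude the n self-pairs, which contribute zero hops.
    return total / (n * (n - 1))


def ring_avg_hops(n: int) -> float:
    """Average shortest-path distance between distinct nodes of a bidirectional ring."""
    # By symmetry it is enough to measure distances from a single node.
    total = sum(min(d, n - d) for d in range(1, n))
    return total / (n - 1)


if __name__ == "__main__":
    for k in (4, 8, 16):
        n = k * k
        print(f"{n:4d} nodes: mesh {mesh_avg_hops(k):.2f} hops, "
              f"ring {ring_avg_hops(n):.2f} hops")
```

Under these assumptions the ring's average distance grows roughly linearly with the node count, while the mesh's grows with the side length k. A single large ring therefore scales poorly on its own, which is one reason hybrid topologies combine short rings with a mesh.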

Updated: 2019-12-12