Bounding the delays of the MPPA network-on-chip with network calculus: Models and benchmarks, Performance Evaluation


Bounding the delays of the MPPA network-on-chip with network calculus: Models and benchmarks
Performance Evaluation (IF 2.2), Pub Date: 2020-11-01, DOI: 10.1016/j.peva.2020.102124
Marc Boyer, Amaury Graillat, Benoît Dupont de Dinechin, Jörn Migge

Abstract: The Kalray MPPA2-256 processor integrates 256 processing cores and 32 management cores on a chip. These cores are grouped into clusters, and the clusters are connected by a high-performance network-on-chip (NoC). This NoC provides hardware mechanisms (ingress traffic limiters) that can be configured to offer service guarantees. This paper introduces a network calculus formulation, designed to configure the NoC traffic limiters, that also computes guaranteed upper bounds on the NoC traversal latencies. This formulation accounts for the traffic shaping performed by the NoC links and can be solved using linear programming. The paper then shows how existing network calculus approaches (Separated Flow Analysis, SFA; Total Flow Analysis, TFA; and the Linear Programming approach, LP) can be adapted to analyze this NoC. The delay bounds obtained by the four approaches are then compared on two case studies: a small configuration taken from a previous study, and a realistic configuration with 128 or 256 flows. These case studies suggest that modeling the shaping introduced by NoC links is of major importance for obtaining accurate bounds. When all packets have the same size, modeling it reduces the bound by 20%–25% on average.
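
As a rough illustration of why accounting for link shaping can tighten a delay bound, the sketch below uses the textbook single-server network calculus result, not the formulation from the paper. It assumes a token-bucket arrival curve (burst b, rate r) and a rate-latency service curve (rate R, latency T); the function names and all numeric values are hypothetical, in arbitrary units.

```python
def delay_bound_unshaped(burst, rate, service_rate, latency):
    """Delay bound for arrival curve b + r*t and service curve R*(t - T)^+.

    Textbook result: the horizontal deviation is T + b / R when r <= R.
    """
    if rate > service_rate:
        raise ValueError("unstable server: arrival rate exceeds service rate")
    return latency + burst / service_rate


def delay_bound_shaped(burst, rate, service_rate, latency, link_capacity):
    """Same server, but the flow first crosses a link of capacity C.

    The arrival curve becomes min(C*t, b + r*t), so the burst can no
    longer arrive instantaneously.
    """
    if rate > service_rate:
        raise ValueError("unstable server: arrival rate exceeds service rate")
    if link_capacity <= service_rate:
        # The link never feeds data faster than the server drains it,
        # so only the service latency remains.
        return latency
    # Here C > R >= r. The worst case sits at the kink of the shaped curve,
    # t* = b / (C - r), which gives T + b * (C - R) / (R * (C - r)).
    return latency + burst * (link_capacity - service_rate) / (
        service_rate * (link_capacity - rate)
    )


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    b, r, R, T, C = 10.0, 1.0, 2.0, 3.0, 4.0
    print("without shaping:", delay_bound_unshaped(b, r, R, T))
    print("with shaping:   ", delay_bound_shaped(b, r, R, T, C))
```

With these made-up values the shaped bound is smaller than the unshaped one, because the link capacity spreads the burst over time. The gains reported in the abstract come from the paper's full multi-hop analysis, which this single-server sketch does not reproduce.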


Updated: 2020-11-01
