Cache-conscious off-line real-time scheduling for multi-core platforms: algorithms and implementation, Real-Time Systems


Cache-conscious off-line real-time scheduling for multi-core platforms: algorithms and implementation
Real-Time Systems (IF 1.3), Pub Date: 2019-03-06, DOI: 10.1007/s11241-019-09333-z
Viet Anh Nguyen, Damien Hardy, Isabelle Puaut

Most schedulability analysis techniques for multi-core architectures assume a single worst-case execution time (WCET) per task, valid under all execution conditions. This assumption is too pessimistic for parallel applications running on multi-core architectures with local instruction or data caches, where the WCET of a task depends on the cache contents at the start of its execution, which in turn depend on the tasks executed immediately before it. In this paper, we propose two scheduling techniques for multi-core architectures equipped with local instruction and data caches. Both techniques schedule a parallel application modeled as a task graph and generate a static, partitioned, non-preemptive schedule that exploits cache reuse between pairs of consecutive tasks. We propose an exact method, using an integer linear programming formulation, as well as a heuristic method based on list scheduling. The efficiency of the techniques is demonstrated by implementing these cache-conscious schedules on real multi-core hardware: a 16-core cluster of the Kalray MPPA-256, Andey generation. We point out issues that arise when implementing the schedules on this particular platform, and we propose strategies to adapt the schedules to the identified implementation factors. An experimental evaluation reveals that our proposed scheduling methods significantly reduce schedule length compared to cache-agnostic scheduling methods. Furthermore, our experiments show that, among the identified implementation factors, shared bus contention has the most impact.
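The idea of a list-scheduling heuristic that takes cache reuse into account can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `cache_conscious_list_schedule`, the input tables `wcet` and `wcet_after`, the topological-order priority list, and the earliest-finish core selection are all assumptions made for illustration, and communication costs and bus contention are ignored.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def cache_conscious_list_schedule(preds, wcet, wcet_after, n_cores):
    """Greedy list scheduling with context-dependent WCETs (illustrative only).

    preds:      task -> set of predecessor tasks (the task graph)
    wcet:       task -> WCET assuming no cache reuse
    wcet_after: (prev, task) -> reduced WCET when `task` runs immediately
                after `prev` on the same core (hypothetical input table)
    Returns (schedule, makespan); schedule maps task -> (core, start, finish).
    """
    # Priority list: any order that respects precedence constraints.
    order = list(TopologicalSorter(preds).static_order())

    core_free = [0] * n_cores     # time at which each core becomes idle
    core_last = [None] * n_cores  # last task placed on each core
    schedule = {}

    for task in order:
        # A task may start only after all its predecessors have finished.
        ready = max((schedule[p][2] for p in preds.get(task, ())), default=0)
        best = None
        for core in range(n_cores):
            start = max(ready, core_free[core])
            # Use the reduced WCET if the previous task on this core
            # leaves useful cache contents; otherwise the plain WCET.
            duration = wcet_after.get((core_last[core], task), wcet[task])
            finish = start + duration
            if best is None or finish < best[2]:
                best = (core, start, finish)
        schedule[task] = best
        core_free[best[0]] = best[2]
        core_last[best[0]] = task

    # Non-preemptive, partitioned: each task runs once, on one core.
    return schedule, max(core_free)
```

For a three-task chain A, B, C with a WCET of 10 each and reuse entries of 6 for (A, B) and 7 for (B, C), this sketch keeps all three tasks on one core and yields a schedule length of 23 on two cores, against 30 when the reuse table is empty.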


Updated: 2019-03-06
