A fault recovery protocol for brokers in centralized publish-subscribe systems targeting multiprocessor systems-on-chips, Analog Integrated Circuits and Signal Processing


A fault recovery protocol for brokers in centralized publish-subscribe systems targeting multiprocessor systems-on-chips
Analog Integrated Circuits and Signal Processing (IF 1.2), Pub Date: 2020-03-25, DOI: 10.1007/s10470-020-01637-6
Anderson R. P. Domingues, Jean Carlo Hamerski, Alexandre de Morais Amory

Abstract

The publish-subscribe programming model has become an alternative for designing data-intensive distributed applications in many domains. Recently, this model has been ported to the domain of Multiprocessor Systems-on-Chips (MPSoCs), in which applications must use the underlying Network-on-Chip communication infrastructure effectively because of architectural constraints such as low power consumption and limited memory size. In such a scenario, the publish-subscribe model fulfills some of these requirements while giving programmers high-level access to the network hardware, thus contributing to software quality. However, the publish-subscribe model relies on a single process, the broker, dedicated to orchestrating communication at the application level. Should a broker process crash, the communication between associated nodes may experience delays, downtime, or even inconsistent data. In extreme cases, communication is lost permanently. Thus, a recovery strategy for brokers in the publish-subscribe model becomes crucial when the application has safety requirements. In this work, we extend a publish-subscribe protocol to add redundancy to brokers' sensitive data. In addition, we provide a recovery protocol to restore brokers after a failure. We also provide analytical models to estimate the communication overhead of our approach. We validate our approach on two distinct MPSoC platforms. The results show that our approach adds only a small memory footprint to the system while keeping system downtime during recovery minimal.
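The general idea of keeping broker data redundant and rebuilding a crashed broker from it can be illustrated with a minimal sketch. This is not the protocol from the paper: the abstract does not say where the redundant data is stored or how recovery is triggered. The sketch assumes, purely for illustration, that each node keeps a copy of its own subscriptions and that a fresh broker is rebuilt from those copies. The names Broker, Node, and recover_broker are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Broker:
    """Hypothetical centralized broker: maps each topic to subscriber node ids."""
    subscriptions: Dict[str, Set[int]] = field(default_factory=dict)

    def subscribe(self, topic: str, node_id: int) -> None:
        self.subscriptions.setdefault(topic, set()).add(node_id)

    def publish(self, topic: str) -> List[int]:
        # Return the ids of the nodes that would receive a message on this topic.
        return sorted(self.subscriptions.get(topic, set()))


@dataclass
class Node:
    """Hypothetical node that keeps a redundant copy of its own subscriptions."""
    node_id: int
    topics: Set[str] = field(default_factory=set)

    def subscribe(self, broker: Broker, topic: str) -> None:
        self.topics.add(topic)                  # redundant local copy (assumption)
        broker.subscribe(topic, self.node_id)   # normal registration at the broker


def recover_broker(nodes: List[Node]) -> Broker:
    """Rebuild a fresh broker's table from the copies held by the nodes."""
    fresh = Broker()
    for node in nodes:
        for topic in sorted(node.topics):
            fresh.subscribe(topic, node.node_id)
    return fresh
```

In this sketch, the memory cost of redundancy is one extra set of topic names per node, and recovery costs one re-registration per stored subscription. The paper's own analytical models for communication overhead are not reproduced here.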


Updated: 2020-03-26
