Recovery Algorithms for Paxos-Based State Machine Replication
IEEE Transactions on Dependable and Secure Computing (IF 7.0), Pub Date: 2019-07-10, DOI: 10.1109/tdsc.2019.2926723
Jan Zbigniew Konczak, Pawel T. Wojciechowski, Nuno Santos, Tomasz Zurkowski, Andre Schiper

In this article, we propose and evaluate three state recovery algorithms for Paxos, one of the most popular distributed agreement protocols. Paxos is commonly used to maintain consistency among state machine replicas despite process failures. The first algorithm, which we call FullSS, derives from the original Paxos and requires the system to use stable storage frequently during regular (non-faulty) execution. The other two algorithms, ViewSS and EpochSS, rarely access stable storage, and the recovering process must do much less work to restore its lost state and catch up with the current state of the system. We thoroughly analyze and compare the behavior of the three algorithms during state recovery and during regular, non-faulty execution, under various workloads (e.g., workloads that saturate the network or the CPU). The experimental results show that ViewSS and EpochSS significantly improve process recovery compared with the original Paxos, provided that a majority of replicas are up and running at all times (not counting replicas that are still recovering). Moreover, these algorithms do not impact the performance of Paxos during regular (non-faulty) operation. However, FullSS is the only choice of the three if the system must tolerate catastrophic failures.
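
The availability assumption behind ViewSS and EpochSS can be written as a small check. This is only an illustrative sketch, not code from the paper: the `ReplicaState` enum and the `majority_is_up` function are hypothetical names, and the sketch assumes "majority" means a strict majority of all replicas, with recovering replicas not counted as up.

```python
from enum import Enum
from typing import Sequence


class ReplicaState(Enum):
    """Hypothetical replica states used only for this illustration."""
    UP = "up"                  # running normally
    RECOVERING = "recovering"  # restarted, still restoring its state
    CRASHED = "crashed"        # not running


def majority_is_up(states: Sequence[ReplicaState]) -> bool:
    """Return True if replicas that are UP form a strict majority.

    Replicas in the RECOVERING state are deliberately not counted as up,
    mirroring the assumption stated in the abstract.
    """
    if not states:
        raise ValueError("at least one replica is required")
    up = sum(1 for s in states if s is ReplicaState.UP)
    # Strict majority: more than half of all replicas.
    return 2 * up > len(states)


if __name__ == "__main__":
    cluster = [ReplicaState.UP, ReplicaState.UP, ReplicaState.RECOVERING]
    print(majority_is_up(cluster))
```

Under this reading, a three-replica system tolerates one replica being crashed or recovering at a time. If the condition cannot be guaranteed, for example when all replicas may fail together, the abstract points to FullSS as the only applicable option.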

Updated: 2019-07-10