Abstract
Consensus is a fundamental abstraction in distributed systems and its solvability is widely discussed in the literature. In message passing distributed systems where there is a need to solve sequential instances of consensus, it is possible that some processes become faulty during one instance and recover later in another instance. Though consensus algorithms should be equipped both to handle process failures and process recovery, only a little amount of work has been done in the literature to handle process recovery. Handling process recovery is not trivial because a recovered process may broadcast a new message which could hamper the progress made by other processes towards achieving consensus in their current round, and thereby forcing them to start a new round. Therefore algorithms that are not designed to handle process recovery require \({\text {O}}\bigl (f\bigr )\) rounds or \({\text {O}}\bigl (f\delta \bigr )\) time to achieve consensus, where at most f processes can recover and \(\delta \) is the message delay in the system. But Dutta et al. (in: International conference on dependable systems and networks (DSN’05), pp 22–27), 2005. https://doi.org/10.1109/DSN.2005.54) showed that the overhead of handling process recovery is constant and their algorithm takes \(17\delta \) time to achieve consensus. In this work, we introduce a new Paxos based algorithm that lowers the upper bound to \(11\delta \). We also show that if all process failures are initial, the upper bound can be further reduced to \(5\delta \). Our algorithm selectively enables processes executing lower rounds to decide irrespective of the presence of higher rounds in the system, minimizing the effect of recovered processes starting a higher round.
Similar content being viewed by others
Notes
The R in R-good run and R-nice run stands for recoverable: these runs allow failed processes to recover and participate in consensus.
Messages broadcast from failed process introduce additional overhead while solving consensus because when another process receives their message, the latter would not know the former has failed and thus does not discard the message while processing the message and changing its state accordingly.
An Accept message contains a round number and an estimate.
The maximum value of t is f.
Including recovered processes after \(T_S\)—since they wait for \(\delta \) time before starting a round after recovery, they receive the dedicated message containing hr from other processes.
References
Ailijiang, A., Charapko, A., Demirbas, M.: Consensus in the cloud: Paxos systems demystified. In: IEEE 25th International Conference on Computer Communication and Networks (ICCCN), pp. 1–10 (2016)
Alagappan, R., Ganesan, A., Lee, E., Albarghouthi, A., Chidambaram, V., Arpaci-Dusseau, A.C., Arpaci-Dusseau, R.H.: Protocol-aware recovery for consensus-based storage. In: 16th \(\{\text{USENIX}\}\) Conference on File and Storage Technologies (\(\{\text{ FAST }\}\) 18), \(\{USENIX\} Association\), pp. 15–32 (2018)
Alistarh, D., Gilbert, S., Guerraoui, R., Travers, C.: How to Solve Consensus in the Smallest Window of Synchrony, pp. 32–46. Springer, Berlin. https://doi.org/10.1007/978-3-540-87779-0_3 (2008)
Batra, R.: Implementation and evaluation of paxos and raft distributed consensus protocols. PhD thesis (2017)
Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM (JACM) 43(4), 685–722 (1996)
Chandra, T.D., Griesemer, R., Redstone, J.: Paxos made live: an engineering perspective. In: Proceedings of the 26th Annual ACM Symposium on Principles of Distributed Computing, pp. 398–407. ACM (2007)
Dolev, D., Dwork, C., Stockmeyer, L.: On the minimal synchronism needed for distributed consensus. J. ACM (JACM) 34(1), 77–97 (1987)
Dutta, P., Guerraoui, R., Lamport, L.: How fast can eventual synchrony lead to consensus? In: International Conference on Dependable Systems and Networks (DSN’05), pp. 22–27. https://doi.org/10.1109/DSN.2005.54 (2005)
Dutta, P., Guerraoui, R., Keidar, I.: The overhead of consensus failure recovery. Distrib. Comput. 19(5–6), 373–386 (2007)
Dwork, C., Lynch, N., Stockmeyer, L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988). https://doi.org/10.1145/42282.42283
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. J. ACM (JACM) 32(2), 374–382 (1985)
Keidar, I., Rajsbaum, S.: On the cost of fault-tolerant consensus when there are no faults: preliminary version. ACM SIGACT News 32(2), 45–63 (2001)
Lamport, L., et al.: Paxos made simple. ACM Sigact News 32(4), 18–25 (2001)
Lorch, J.R., Adya, A., Bolosky, W.J., Chaiken, R., Douceur, J.R., Howell, J.: The smart way to migrate replicated stateful services. ACM SIGOPS Oper. Syst. Rev. ACM 40, 103–115 (2006)
Lynch, N.A.: Distributed algorithms. Elsevier, Amsterdam (1996)
Moraru, I., Andersen, D.G., Kaminsky, M.: There is more consensus in egalitarian parliaments. In: Proceedings of the 24th ACM Symposium on Operating Systems Principles, pp. 358–372. ACM (2013)
Ongaro, D., Ousterhout, J.K.: In: search of an understandable consensus algorithm. In: USENIX Annual Technical Conference, pp. 305–319 (2014)
Pease, M., Shostak, R., Lamport, L.: Reaching agreement in the presence of faults. J. ACM (JACM) 27(2), 228–234 (1980)
Sutra, P., Shapiro, M.: Fast genuine generalized consensus. In: 30th IEEE Symposium on Reliable Distributed Systems (SRDS), pp. 255–264. IEEE (2011)
Van Renesse, R., Altinbuken, D.: Paxos made moderately complex. ACM Comput. Surv. (CSUR) 47(3), 42 (2015)
Van Renesse, R., Schiper, N., Schneider, F.B.: Vive la différence: Paxos vs. viewstamped replication vs. zab. IEEE Trans. Depend. Secur. Comput. 12(4), 472–484 (2015)
Yanhua, M.F., Junqueria, P., Marzullo, K.: Mencius: building efficient replicated state machines for WANs. In: Proceedings of the Symposium on Operating System Design and Implementation, pp. 369–384 (2008)
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Srinivasan, S., Kandukoori, R. A Paxos based algorithm to minimize the overhead of process recovery in consensus. Acta Informatica 56, 433–446 (2019). https://doi.org/10.1007/s00236-019-00334-w
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00236-019-00334-w