Skip to main content
Log in

Multi-shot distributed transaction commit

  • Published:
Distributed Computing Aims and scope Submit manuscript

Abstract

Atomic Commit Problem (ACP) is a single-shot agreement problem similar to consensus, meant to model the properties of transaction commit protocols in fault-prone distributed systems. We argue that ACP is too restrictive to capture the complexities of modern transactional data stores, where commit protocols are integrated with concurrency control, and their executions for different transactions are interdependent. As an alternative, we introduce Transaction Certification Service (TCS), a new formal problem that captures safety guarantees of multi-shot transaction commit protocols with integrated concurrency control. TCS is parameterized by a certification function that can be instantiated to support common isolation levels, such as serializability and snapshot isolation. We then derive a provably correct crash-resilient protocol for implementing TCS through successive refinement. Our protocol achieves a better time complexity than mainstream approaches that layer two-phase commit on top of Paxos-style replication.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

Notes

  1. In practice, the client only needs to send the data relevant to the corresponding shard. We omit this optimisation to simplify notation.

References

  1. Berenson, H., Bernstein, P., Gray, J., Melton, J., O’Neil, E., O’Neil, P.: A critique of ANSI SQL isolation levels. In: Conference on Management of Data (SIGMOD) (1995)

  2. Bernstein, P.A., Hadzilacos, V., Goodman, N.: Concurrency Control and Recovery in Database Systems. Addison-Wesley Longman Publishing Co., Inc., Boston (1986)

    Google Scholar 

  3. Borr, A.J.: Transaction monitoring in ENCOMPASS: reliable distributed transaction processing. In: International Conference on Very Large Data Bases (VLDB) (1981)

  4. Bravo, M., Gotsman, A.: Reconfigurable atomic transaction commit. In: Symposium on Principles of Distributed Computing (PODC) (2019)

  5. Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)

    Article  MathSciNet  Google Scholar 

  6. Charron-Bost, B., Schiper, A.: Uniform consensus is harder than consensus. J. Algorithms 51(1), 15–37 (2004)

    Article  MathSciNet  Google Scholar 

  7. Coelho, P.R., Schiper, N., Pedone, F.: Fast atomic multicast. In: Conference on Dependable Systems and Networks (DSN) (2017)

  8. Corbett, J.C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J.J., Ghemawat, S., Gubarev, A., Heiser, C., Hochschild, P., Hsieh, W.C., Kanthak, S., Kogan, E., Li, H., Lloyd, A., Melnik, S., Mwaura, D., Nagle, D., Quinlan, S., Rao, R., Rolig, L., Saito, Y., Szymaniak, M., Taylor, C., Wang, R., Woodford, D.: Spanner: Google’s globally-distributed database. In: Symposium on Operating Systems Design and Implementation (OSDI) (2012)

  9. Défago, X., Schiper, A., Urbán, P.: Total order broadcast and multicast algorithms: taxonomy and survey. ACM Comput. Surv. 36(4), 372–421 (2004)

    Article  Google Scholar 

  10. Ding, B., Kot, L., Demers, A., Gehrke, J.: Centiman: elastic, high performance optimistic concurrency control by watermarking. In: Symposium on Cloud Computing (SoCC) (2015)

  11. Dragojević, A., Narayanan, D., Nightingale, E.B., Renzelmann, M., Shamis, A., Badam, A., Castro, M.: No compromises: distributed transactions with consistency, availability, and performance. In: Symposium on Operating Systems Principles (SOSP) (2015)

  12. Dwork, C., Lynch, N., Stockmeyer, L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)

    Article  MathSciNet  Google Scholar 

  13. Dwork, C., Skeen, D.: The inherent cost of nonblocking commitment. In: Symposium on Principles of Distributed Computing (PODC) (1983)

  14. Filipovic, I., O’Hearn, P.W., Rinetzky, N., Yang, H.: Abstraction for concurrent objects. Theor. Comput. Sci. 411(51–52), 4379–4398 (2010)

    Article  MathSciNet  Google Scholar 

  15. Glendenning, L., Beschastnikh, I., Krishnamurthy, A., Anderson, T.: Scalable consistency in Scatter. In: Symposium on Operating Systems Principles (SOSP) (2011)

  16. Gotsman, A., Lefort, A., Chockler, G.V.: White-box atomic multicast. In: Conference on Dependable Systems and Networks (DSN) (2019)

  17. Gray, J.: Notes on data base operating systems. In: International Conference on Very Large Databases (1978)

  18. Gray, J.: The transaction concept: virtues and limitations (invited paper). In: International Conference on Very Large Data Bases (VLDB) (1981)

  19. Gray, J., Lamport, L.: Consensus on transaction commit. ACM Trans. Database Syst. 31(1), 133–160 (2006)

    Article  Google Scholar 

  20. Guerraoui, R.: Revisiting the relationship between non-blocking atomic commitment and consensus. In: Workshop on Distributed Algorithms (WDAG) (1995)

  21. Guerraoui, R., Larrea, M., Schiper, A.: Reducing the cost for non-blocking in atomic commitment. In: International Conference on Distributed Computing Systems (ICDCS) (1996)

  22. Guerraoui, R., Wang, J.: How fast can a distributed transaction commit? In: Symposium on Principles of Database Systems (PODS) (2017)

  23. Hadzilacos, V.: On the relationship between the atomic commitment and consensus problems. In: Asilomar Workshop on Fault-Tolerant Distributed Computing (1990)

  24. Herlihy, M.P., Wing, J.M.: Linearizability: a correctness condition for concurrent objects. ACM Trans. Program. Lang. Syst. 12(3), 463–492 (1990)

    Article  Google Scholar 

  25. Junqueira, F.P., Reed, B.C., Serafini, M.: Zab: high-performance broadcast for primary-backup systems. In: Conference on Dependable Systems and Networks (DSN) (2011)

  26. Keidar, I., Dolev, D.: Increasing the resilience of atomic commit at no additional cost. In: Symposium on Principles of Database Systems (PODS) (1995)

  27. Keidar, I., Rajsbaum, S.: A simple proof of the uniform consensus synchronous lower bound. Inf. Process. Lett. 85(1), 47–52 (2003)

    Article  MathSciNet  Google Scholar 

  28. Kokocinski, M., Kobus, T., Wojciechowski, P.T.: Make the leader work: executive deferred update replication. In: Symposium on Reliable Distributed Systems (SRDS) (2014)

  29. Kraska, T., Pang, G., Franklin, M.J., Madden, S., Fekete, A.: MDCC: multi-data center consistency. In: European Conference on Computer Systems (EuroSys) (2013)

  30. Lamport, L.: The part-time parliament. ACM Trans. Comput. Syst. 16(2), 133–169 (1998)

    Article  Google Scholar 

  31. Mahmoud, H., Nawab, F., Pucher, A., Agrawal, D., El Abbadi, A.: Low-latency multi-datacenter databases using replicated commit. Proc. VLDB Endow. 6(9), 661–672 (2013)

    Article  Google Scholar 

  32. Mahmoud, H.A., Arora, V., Nawab, F., Agrawal, D., El Abbadi, A.: Maat: effective and scalable coordination of distributed transactions in the cloud. Proc. VLDB Endow. 7(5), 329–340 (2014)

    Article  Google Scholar 

  33. Maiyya, S., Nawab, F., Agrawal, D., El Abbadi, A.: Unifying consensus and atomic commitment for effective cloud data management. Proc. VLDB Endow. 12(5), 611–623 (2019)

    Article  Google Scholar 

  34. Maiyya, S., Nawab, F., Agrawal, D., El Abbadi, A.: Private communication. 2020–2021

  35. Maiyya, S., Nawab, F., Agrawal, D., El Abbadi, A.: Errata for “Unifying consensus and atomic commitment for effective cloud data management”. Submitted to VLDB’21 (2021)

  36. Mu, S., Nelson, L., Lloyd, W., Li, J.: Consolidating concurrency control and consensus for commits under conflicts. In: Symposium on Operating Systems Design and Implementation (OSDI) (2016)

  37. Oki, B.M., Liskov, B.H.: Viewstamped replication: a new primary copy method to support highly-available distributed systems. In: Symposium on Principles of Distributed Computing (PODC) (1988)

  38. Pedone, F., Guerraoui, R., Schiper, A.: The database state machine approach. Distrib. Parallel Databases 14(1), 71–98 (2003)

  39. Peluso, S., Romano, P., Quaglia, F.: Score: a scalable one-copy serializable partial replication protocol. In: International Middleware Conference (Middleware) (2012)

  40. Peluso, S., Ruivo, P., Romano, P., Quaglia, F., Rodrigues, L.E.T.: GMU: genuine multiversion update-serializable partial data replication. IEEE Trans. Parallel Distrib. Syst. 27(10), 71–98 (2016)

    Article  Google Scholar 

  41. Ramarao, K.V.S.: Complexity of distributed commit protocols. Acta Inform. 26(6), 577–595 (1989)

    Article  MathSciNet  Google Scholar 

  42. Saeida Ardekani, M., Sutra, P., Shapiro, M.: G-DUR: a middleware for assembling, analyzing, and improving transactional protocols. In: International Middleware Conference (Middleware) (2014)

  43. Schiper, N., Sutra, P., Pedone, F.: P-store: genuine partial replication in wide area networks. In: Symposium on Reliable Distributed Systems (SRDS) (2010)

  44. Schneider, F.B.: Implementing fault-tolerant services using the state machine approach: a tutorial. ACM Comput. Surv. 22(4), 299–319 (1990)

    Article  Google Scholar 

  45. Sciascia, D., Pedone, F., Junqueira, F.: Scalable deferred update replication. In: Conference on Dependable Systems and Networks (DSN) (2012)

  46. Skeen, D.: Nonblocking commit protocols. In: Conference on Management of Data (SIGMOD) (1981)

  47. Sovran, Y., Power, R., Aguilera, M.K., Li, J.: Transactional storage for geo-replicated systems. In: Symposium on Operating Systems Principles (SOSP) (2011)

  48. Thomson, A., Diamond, T., Weng, S., Ren, K., Shao, P., Abadi, D.J.: Calvin: fast distributed transactions for partitioned database systems. In: Conference on Management of Data (SIGMOD) (2012)

  49. Weikum, G., Vossen, G.: Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery. Morgan Kaufmann Publishers Inc., Burlington (2001)

    Google Scholar 

  50. Zhang, I., Sharma, N.K., Szekeres, A., Krishnamurthy, A., Ports, D.R.K.: Building consistent transactions with inconsistent replication. In: Symposium on Operating Systems Principles (SOSP) (2015)

  51. Zhang, I., Sharma, N.K., Szekeres, A., Krishnamurthy, A., Ports, D.R.K.: When is operation ordering required in replicated transactional storage? IEEE Data Eng. Bull. 39(1), 27–38 (2016)

    Google Scholar 

Download references

Acknowledgements

Alexey Gotsman was supported by a Starting Grant RACCOON from the European Research Council.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Alexey Gotsman.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is a revised and expanded version of a paper that received a Best Paper Award at the 32nd International Symposium on Distributed Computing (DISC)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Chockler, G., Gotsman, A. Multi-shot distributed transaction commit. Distrib. Comput. 34, 301–318 (2021). https://doi.org/10.1007/s00446-021-00389-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00446-021-00389-4

Keywords

Navigation