Plover: parallel logging for replication systems

Zhou, Huan; Guo, Jinwei; Hu, Huiqi; Qian, Weining; Zhou, Xuan; Zhou, Aoying

doi:10.1007/s11704-019-8314-y

Research Article
Published: 03 January 2020

Volume 14, article number 144606, (2020)

Frontiers of Computer Science

Abstract

Logging and replication are commonly used recovery approaches in database systems. To guarantee that the database state is not corrupted by a system crash, database systems rely on a centralized logging method to persist log entries to a stable storage device; to prevent data loss due to device failure, a primary server periodically replicates its state to backup servers by copying log entries over the network. Because transaction execution in a modern database system is highly parallelized, centralized logging with a single I/O channel tends to inhibit the scalability of the system. Meanwhile, log entries generated at high speed make a network with limited bandwidth a potential bottleneck for replication.

In this paper, we propose Plover, an in-memory transaction engine with parallel logging and speedy replication for primary-backup replication systems. Parallel logging enables log entries to be written concurrently by using multiple log buffers, each associated with its own stable storage device. Every log entry in the log buffers carries a global sequence number (GSN), which ensures a partial order among transactions. The core of the speedy replication is an adaptive shipping method, which transfers data increments instead of log entries to backups under heavy workloads. Experimental results using the YCSB and TPC-C benchmarks show that Plover scales well as the number of worker threads and stable storage devices increases. Moreover, adaptive shipping requires only one fifth of the network bandwidth of conventional log shipping.
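The abstract does not say how GSNs are assigned, so the following is only a minimal sketch of how a GSN can impose a partial order across several log buffers. It assumes a Lamport-clock-style rule: a transaction's GSN is one more than the largest GSN seen on its log buffer and on the records it writes. The names `Record`, `LogBuffer`, `append_log`, and `replay_order` are hypothetical and are not taken from Plover, and only write-write dependencies are tracked.

```python
import heapq
import threading
from dataclasses import dataclass, field


@dataclass
class Record:
    """A data item that remembers the GSN of the last transaction that wrote it."""
    gsn: int = 0


@dataclass
class LogBuffer:
    """One of several log buffers; each would flush to its own storage device."""
    entries: list = field(default_factory=list)
    last_gsn: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock)


def append_log(buf, txn_id, written):
    """Append a commit entry for txn_id to buf and return its GSN.

    Assumption: the caller's concurrency control already holds exclusive
    access to every record in `written`, so only the buffer needs a lock.
    """
    with buf.lock:
        # Lamport-style rule (an assumption, not the paper's stated method).
        gsn = max([buf.last_gsn] + [r.gsn for r in written]) + 1
        buf.last_gsn = gsn
        for r in written:
            r.gsn = gsn
        buf.entries.append((gsn, txn_id))
    return gsn


def replay_order(buffers):
    """Merge the per-buffer logs into one GSN-ordered stream for recovery."""
    return list(heapq.merge(*(b.entries for b in buffers)))


x, y = Record(), Record()
buf0, buf1 = LogBuffer(), LogBuffer()
append_log(buf0, "t1", [x])     # GSN 1
append_log(buf1, "t2", [y])     # GSN 1: independent of t1, so left unordered
append_log(buf1, "t3", [x, y])  # GSN 2: ordered after both t1 and t2
append_log(buf0, "t4", [x])     # GSN 3: ordered after t3 through record x
print(replay_order([buf0, buf1]))
```

In this sketch, transactions that touch disjoint records on different buffers may share a GSN, which is what makes the order partial rather than total; dependent transactions always receive strictly increasing GSNs, so merging the buffers by GSN yields a valid replay order.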

Acknowledgements

This work was partially supported by the National Key R&D Program of China (2018YFB1003303), the National Natural Science Foundation of China (Grant Nos. 61672232 and 61772202), the Youth Foundation of National Science Foundation (61702189), and the Youth Science and Technology “Yang Fan” Program of Shanghai (17YF1427800).

Author information

Authors and Affiliations

School of Data Science and Engineering, East China Normal University, Shanghai, 200062, China
Huan Zhou, Jinwei Guo, Huiqi Hu, Weining Qian, Xuan Zhou & Aoying Zhou

Corresponding author

Correspondence to Huiqi Hu.

Additional information

Huan Zhou is a PhD candidate in the School of Data Science and Engineering, East China Normal University, China. She received her BS in Computer Science and Technology from Sichuan Normal University, China in 2013. Her research interests include transaction processing in database management systems and replication in distributed systems.

Jinwei Guo is a PhD candidate in the School of Data Science and Engineering, East China Normal University (ECNU), China. He received his bachelor's degree in Computer Science and Technology from Qufu Normal University, China in 2010, and his master's degree from Guizhou University, China in 2014. His research interests include transaction processing in database management systems and high availability in distributed systems.

Huiqi Hu is currently a lecturer in the School of Data Science and Engineering, East China Normal University, China. He received his PhD degree from Tsinghua University, China. His research interests mainly include database system theory and implementation, and query optimization.

Weining Qian is a professor and dean of the School of Data Science and Engineering, East China Normal University, China. He received his MS and PhD in computer science from Fudan University, China in 2001 and 2004, respectively. He is now serving as a standing committee member of the Database Technology Committee of the China Computer Federation, and a committee member of the ACM SIGMOD China Chapter. His research interests include scalable transaction processing, benchmarking big data systems, and management and analysis of massive datasets.

Xuan Zhou is a professor and a vice dean of the School of Data Science and Engineering, East China Normal University (ECNU), China. He obtained his BSc from Fudan University, China and his PhD from the National University of Singapore, both in Computer Science. After graduating in 2005, he worked as a scientist at the L3S Research Centre, Germany and the CSIRO ICT Centre, Australia until the end of 2010. Before joining ECNU in 2017, he spent six years at Renmin University of China as an associate professor. Xuan’s research interests include database systems and information retrieval.

Aoying Zhou is Vice President of East China Normal University, Founding Dean of the School of Data Science and Engineering (DaSE), and a professor. He received his master's and bachelor's degrees in Computer Science from Sichuan University, China in 1988 and 1985, respectively, and his PhD from Fudan University, China in 1993. He is the winner of the National Science Fund for Distinguished Young Scholars supported by the National Natural Science Foundation of China (NSFC) and the professorship appointment under the Changjiang Scholars Program of the Ministry of Education (MoE). He is a CCF (China Computer Federation) Fellow, the Vice Director of the Database Technology Committee of CCF, and Associate Editor-in-Chief of Chinese Journal of Computer. He served as General Chair of ER’2004, Vice PC Chair of ICDE’2009 and ICDE’2012, and PC Co-chair of VLDB’2014. His research interests include Web data management, data management for data-intensive computing, in-memory cluster computing, distributed transaction processing, benchmarking for big data and performance.

Electronic supplementary material