Abstract
Logging and replication are commonly used recovery approaches in database systems. To guarantee that the database state is not corrupted by a system crash, database systems rely on a centralized logging method that persists log entries to a stable storage device; to prevent data loss due to device failure, a primary server periodically replicates its state to backup servers by copying log entries over the network. Because transaction execution in a modern database system is highly parallelized, centralized logging through a single I/O channel tends to limit the scalability of the system. Meanwhile, log entries are generated at such a high rate that a network with limited bandwidth becomes a potential bottleneck for replication.
In this paper, we propose Plover, an in-memory transaction engine with parallel logging and speedy replication for primary-backup replication systems. Parallel logging lets worker threads log concurrently by using multiple log buffers, each associated with its own stable storage device. Every log entry in the log buffers carries a global sequence number (GSN), which ensures a partial order among transactions. The core of speedy replication is an adaptive shipping method, which transfers data increments instead of log entries to the backups under heavy workloads. Experimental results using the YCSB and TPC-C benchmarks show that Plover scales well with an increasing number of worker threads and stable storage devices. Moreover, adaptive shipping requires only one fifth of the network bandwidth of conventional log shipping.
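The two ideas in the abstract can be illustrated with a small sketch. It is not Plover's implementation: the abstract does not say how GSNs are assigned or what a data increment contains. The sketch assumes a Lamport-style rule (a new entry takes one more than the largest GSN seen by its buffer, its record, and its transaction) and treats an increment as the latest value per key. The names `LogBuffer`, `Record`, `log_write`, and `coalesce` are hypothetical.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class LogBuffer:
    """One log buffer; a real engine would flush each to its own device."""
    gsn: int = 0
    entries: list = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock)


@dataclass
class Record:
    """A data record that remembers the GSN of its last logged write."""
    value: int = 0
    gsn: int = 0


def log_write(buf, rec, txn_gsn, key, new_value):
    """Append one entry to buf and return its GSN.

    Assumed rule: GSN = max(buffer, record, transaction) + 1, so dependent
    writes are ordered even when they land in different buffers. The record
    is assumed to be protected by the engine's concurrency control.
    """
    with buf.lock:
        gsn = max(buf.gsn, rec.gsn, txn_gsn) + 1
        buf.gsn = rec.gsn = gsn
        rec.value = new_value
        buf.entries.append((gsn, key, new_value))
    return gsn


def coalesce(entries):
    """Assumed form of a data increment: keep only the highest-GSN write per key."""
    latest = {}
    for gsn, key, value in entries:
        if key not in latest or gsn > latest[key][0]:
            latest[key] = (gsn, value)
    return latest


buffers = [LogBuffer(), LogBuffer()]
x, y = Record(), Record()
g1 = log_write(buffers[0], x, 0, "x", 1)  # first write to x, buffer 0
g2 = log_write(buffers[1], x, 0, "x", 2)  # later write to x, buffer 1
g3 = log_write(buffers[0], y, 0, "y", 7)  # unrelated write to y, buffer 0
increments = coalesce(buffers[0].entries + buffers[1].entries)
```

The two writes to `x` go to different buffers yet receive increasing GSNs, while the unrelated write to `y` may share a GSN with one of them, which is why the order is partial rather than total. Coalescing turns the three log entries into two increments; savings of this kind are what would let shipping increments use less bandwidth than shipping every log entry.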
Acknowledgements
This work was partially supported by the National Key R&D Program of China (2018YFB1003303), the National Natural Science Foundation of China (Grant Nos. 61672232 and 61772202), the Youth Foundation of National Science Foundation (61702189), and the Youth Science and Technology “Yang Fan” Program of Shanghai (17YF1427800).
Author information
Huan Zhou is a PhD candidate in the School of Data Science and Engineering, East China Normal University, China. She received her BS in Computer Science and Technology from Sichuan Normal University, China in 2013. Her research interests include transaction processing in database management systems and replication in distributed systems.
Jinwei Guo is a PhD candidate in the School of Data Science and Engineering, East China Normal University (ECNU), China. He received his bachelor's degree in Computer Science and Technology from Qufu Normal University, China in 2010, and his master's degree from Guizhou University, China in 2014. His research interests include transaction processing in database management systems and high availability in distributed systems.
Huiqi Hu is currently a lecturer in the School of Data Science and Engineering, East China Normal University, China. He received his PhD degree from Tsinghua University, China. His research interests mainly include database system theory and implementation, and query optimization.
Weining Qian is a professor and dean of the School of Data Science and Engineering, East China Normal University, China. He received his MS and PhD in computer science from Fudan University, China in 2001 and 2004, respectively. He is now serving as a standing committee member of the Database Technology Committee of the China Computer Federation, and as a committee member of the ACM SIGMOD China Chapter. His research interests include scalable transaction processing, benchmarking big data systems, and management and analysis of massive datasets.
Xuan Zhou is a professor and a vice dean of the School of Data Science and Engineering, East China Normal University (ECNU), China. He obtained his BSc from Fudan University, China and his PhD from the National University of Singapore, both in Computer Science. After graduating in 2005, he worked as a scientist at the L3S Research Centre, Germany and the CSIRO ICT Centre, Australia until the end of 2010. Before he joined ECNU in 2017, he spent six years at Renmin University of China as an associate professor. Xuan’s research interests include database systems and information retrieval.
Aoying Zhou is a professor, Vice President of East China Normal University, and Founding Dean of the School of Data Science and Engineering (DaSE). He received his bachelor's and master's degrees in Computer Science from Sichuan University, China in 1985 and 1988, respectively, and his PhD from Fudan University, China in 1993. He is a winner of the National Science Fund for Distinguished Young Scholars supported by the National Natural Science Foundation of China (NSFC) and holds a professorship appointment under the Changjiang Scholars Program of the Ministry of Education (MoE). He is a CCF (China Computer Federation) Fellow, the Vice Director of the Database Technology Committee of CCF, and Associate Editor-in-Chief of Chinese Journal of Computer. He served as General Chair of ER’2004, Vice PC Chair of ICDE’2009 and ICDE’2012, and PC Co-chair of VLDB’2014. His research interests include Web data management, data management for data-intensive computing, in-memory cluster computing, distributed transaction processing, benchmarking for big data and performance.
Cite this article
Zhou, H., Guo, J., Hu, H. et al. Plover: parallel logging for replication systems. Front. Comput. Sci. 14, 144606 (2020). https://doi.org/10.1007/s11704-019-8314-y
DOI: https://doi.org/10.1007/s11704-019-8314-y