Abstract
The emergence of non-volatile memory (NVM) has introduced new opportunities for performance optimizations in existing storage systems. To better utilize its byte-addressability and near-DRAM performance, NVM can be attached on the memory bus and accessed via load/store memory instructions rather than the conventional block interface. In this scenario, a cache line (usually 64 bytes) becomes the data transfer unit between volatile and non-volatile devices. However, the failure-atomicity of write on NVM is the memory bit width (usually 8 bytes). This mismatch between the data transfer unit and the atomicity unit may introduce write amplification and compromise data consistency of node-based data structures such as B+-trees. In this paper, we propose WOBTree, a Write-Optimized B+-Tree for NVM to address the mismatch problem without expensive logging. WOBTree minimizes the update granularity from a tree node to a much smaller subnode and carefully arranges the write operations in it to ensure crash consistency and reduce write amplification. Experimental results show that compared with previous persistent B+-tree solutions, WOBTree reduces the write amplification by up to 86× and improves write performance by up to 61× while maintaining similar search performance.
Similar content being viewed by others
References
Mittal S, Vetter J S. A survey of software techniques for using non-volatile memories for storage and main memory systems. IEEE Transactions on Parallel and Distributed Systems, 2015, 27(5): 1537–1550
Li D, Liao X, Jin H, Tang Y, Zhao G. Writeback throttling in a virtualized system with SCM. Frontiers of Computer Science, 2016, 10(1): 82–95
Barber R, Bendel P, Czech M, Draese O, Ho F, Hrle N, Idreos S, Kim M, Koeth O, Lee J, Li T T, Lohman G M, Morfonios K, Müller R, Murthy K, Pandis I, Qiao L, Raman V, Szabo S, Sidle R, Stolze K. Blink: not your father’s database! In: Proceedings of the 37th International Conference on Very Large Databases. 2011, 1–22
Diaconu C, Freedman C, Ismert E, Larson P, Mittal P, Stonecipher R, Verma N, Zwilling M. Hekaton: SQL server’s memory-optimized OLTP engine. In: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. 2013, 1243–1254
Plattner H. The impact of columnar in-memory databases on enterprise systems. Proceedings of the VLDB Endowment, 2014, 7(13): 1722–1729
Nishtala R, Fugal H, Grimm S, Kwiatkowski M, Lee H, Li H C, McElroy R, Paleczny M, Peek D, Saab P, Stafford D, Tung T, Venkataramani V. Scaling memcache at facebook. In: Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation. 2013, 385–398
Ousterhout J K, Agrawal P, Erickson D, Kozyrakis C, Leverich J, Mazières D, Mitra S, Narayanan A, Parulkar G M, Rosenblum M, Rumble S M, Stratmann E, Stutsman R. The case for ramclouds: scalable highperformance storage entirely in dram. ACM SIGOPS Operating Systems Review, 2010, 43(4): 92–105
Zaharia M, Chowdhury M, Das T, Dave A, Ma J, McCauly M, Franklin M J, Shenker S, Stoica I. Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. 2012, 15–28
Chen S, Jin Q. Persistent B+-trees in non-volatile main memory. Proceedings of the VLDB Endowment, 2015, 8(7): 786–797
Condit J, Nightingale E B, Frost C, Ipek E, Lee B C, Burger D, Coetzee D. Better I/O through byte-addressable, persistent memory. In: Proceedings of the 22nd ACM Symposium on Operating Systems Principles. 2009, 133–146
Hwang D, Kim W, Won Y, Nam B. Endurable transient inconsistency in byte-addressable persistent B+-tree. In: Proceedings of the 16th USENIX Conference on File and Storage Technologies. 2018, 187–200
Götze P, Baumann S, Sattler K. An NVM-aware storage layout for analytical workloads. In: Proceedings of the 34th IEEE International Conference on Data Engineering Workshops. 2018, 110–115
Lee S K, Lim K H, Song H, Nam B, Noh S H. Wort: write optimal radix tree for persistent memory storage systems. In: Proceedings of the 15th USENIX Conference on File and Storage Technologies. 2017, 257–270
Danowitz A, Kelley K, Mao J, Stevenson J P, Horowitz M. CPU DB: recording microprocessor history. Communications of the ACM, 2012, 55(4): 55–63
Yang J, Wei Q, Chen C, Wang C, Yong K L, He B. Nv-tree: reducing consistency cost for nvm-based single level systems. In: Proceedings of the 13th USENIX Conference on File and Storage Technologies. 2015, 167–181
Oukid I, Lasperas J, Nica A, Willhalm T, Lehner W. Fptree: a hybrid scmdram persistent and concurrent b-tree for storage class memory. In: Proceedings of the 2016 International Conference on Management of Data. 2016, 371–386
Rixner S, Dally W J, Kapasi U J, Mattson P R, Owens J D. Memory access scheduling. In: Proceedings of the 27th International Symposium on Computer Architecture. 2000, 128–138
Wang Y, Kapritsos M, Ren Z, Mahajan P, Kirubanandam J, Alvisi L, Dahlin M. Robustness in the salus scalable block store. In: Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation. 2013, 357–370
Moraru I, Andersen D G, Kaminsky M, Tolia N, Ranganathan P, Binkert N L. Consistent, durable, and safe memory management for byte-addressable non volatile main memory. In: Proceedings of the 1st ACM SIGOPS Conference on Timely Results in Operating Systems. 2013, 1–17
Volos H, Magalhaes G, Cherkasova L, Li J. Quartz: a lightweight performance emulator for persistent memory software. In: Proceedings of the 16th Annual Middleware Conference. 2015, 37–49
Arulraj J, Pavlo A, Dulloor S. Let’s talk about storage & recovery methods for non-volatile memory database systems. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. 2015, 707–722
Kim W, Kim J, Baek W, Nam B, Won Y. NVWAL: exploiting nvram in write-ahead logging. ACM SIGOPS Operating Systems Review, 2016, 50(2): 385–398
Chen A. A review of emerging non-volatile memory (NVM) technologies and applications. Solid-state Electronics, 2016, 125(Nov): 25–38
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2018YFB1004401), the National Natural Science Foundation of China for Young Scientists (Grant No. 61502392), and the General Program of the National Natural Science Foundation of China (61472323).
Author information
Authors and Affiliations
Corresponding author
Additional information
Haitao Wang is currently a PhD student at the School of Computer Science and Engineering, Northwestern Polytechnical University, China. His research interests include performance analysis and optimization of file systems, key-value stores, and cloud storage systems.
Zhanhuai Li is currently a professor and PhD supervisor at the School of Computer Science and Engineering, Northwestern Polytechnical University, China. His research interests include database theory, massive data storage, and cloud computing.
Xiao Zhang is currently an associate professor at the School of Computer Science and Engineering, Northwestern Polytechnical University, China. His research interests include computer networks, distributed file systems, and massive data storage.
Xiaonan Zhao is currently an associate professor at the School of Computer Science and Engineering, Northwestern Polytechnical University, China. Her research interests include storage systems, computer networks, and distributed file systems.
Song Jiang is currently an associate professor at the Department of Computer Science and Engineering, the University of Texas at Arlington, China. His research interests include data management systems, file and storage systems, and I/O performance in high-performance computing.
Electronic supplementary material
Rights and permissions
About this article
Cite this article
Wang, H., Li, Z., Zhang, X. et al. WOBTree: a write-optimized B+-tree for non-volatile memory. Front. Comput. Sci. 15, 155106 (2021). https://doi.org/10.1007/s11704-020-0228-1
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s11704-020-0228-1