TH-DPMS
ACM Transactions on Storage (IF 1.7) Pub Date: 2020-10-02, DOI: 10.1145/3412852
Jiwu Shu, Youmin Chen, Qing Wang, Bohong Zhu, Junru Li, Youyou Lu

The rapid growth of data in recent years requires datacenter infrastructure to store and process data with extremely high throughput and low latency. Fortunately, persistent memory (PM) and RDMA technologies bring new opportunities toward this goal. Both are capable of delivering more than 10 GB/s of bandwidth and sub-microsecond latency. However, our past experience and recent studies show that it is non-trivial to build an efficient distributed storage system with such new hardware. In this article, we design and implement TH-DPMS (TsingHua Distributed Persistent Memory System) based on persistent memory and RDMA, which unifies the memory, file system, and key-value interfaces in a single system. TH-DPMS is built on a unified distributed persistent memory abstraction, pDSM. pDSM acts as a generic layer that connects the PMs of different storage nodes via a high-speed RDMA network and organizes them into a global shared address space. It provides the fundamental functionalities, including global address management, space management, fault tolerance, and crash consistency guarantees. Applications access pDSM through a set of flexible, easy-to-use APIs, using either raw read/write interfaces or transactional interfaces with ACID guarantees. Based on pDSM, we implement a distributed file system and a key-value store, named pDFS and pDKVS, respectively; together, they provide TH-DPMS with high-performance, low-latency, fault-tolerant data storage. We evaluate TH-DPMS with both micro-benchmarks and real-world memory-intensive workloads. Experimental results show that TH-DPMS delivers an aggregate bandwidth of 120 GB/s with six nodes. When processing memory-intensive workloads such as YCSB and Graph500, TH-DPMS improves performance by an order of magnitude over existing systems and maintains consistently high efficiency as the workload size grows to multiple terabytes.
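To make the two access styles the abstract describes more concrete, below is a minimal C sketch of how an application might use a pDSM-like interface: raw one-sided reads/writes against a global shared address space, and transactional writes with ACID guarantees. All identifiers here (gaddr_t, pdsm_read, pdsm_write, pdsm_tx_*) are hypothetical illustrations, not the published pDSM API, and the "remote PM" is stubbed with a local buffer so the sketch is self-contained and runnable.

```c
/*
 * Hypothetical sketch of a pDSM-style API. The real system would issue
 * RDMA verbs against remote PM; this stub uses a local buffer instead.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A global address: which node's PM, and the offset within it. */
typedef struct { uint16_t node_id; uint64_t offset; } gaddr_t;

/* Stand-in for one node's persistent memory region. */
static uint8_t fake_pm[1 << 20];

/* Raw one-sided access: in a real system these would be RDMA
 * read/write operations against the remote node's registered PM. */
static int pdsm_read(gaddr_t src, void *buf, size_t len) {
    memcpy(buf, fake_pm + src.offset, len);
    return 0;
}
static int pdsm_write(gaddr_t dst, const void *buf, size_t len) {
    memcpy(fake_pm + dst.offset, buf, len);
    /* A real implementation would also persist the remote write. */
    return 0;
}

/* Transactional access: sketched as a buffered-write transaction
 * whose updates are applied together at commit time. */
typedef struct { gaddr_t dst; uint8_t data[64]; size_t len; } tx_entry_t;
typedef struct { tx_entry_t log[16]; int n; } pdsm_tx_t;

static void pdsm_tx_begin(pdsm_tx_t *tx) { tx->n = 0; }
static void pdsm_tx_write(pdsm_tx_t *tx, gaddr_t dst,
                          const void *buf, size_t len) {
    tx->log[tx->n].dst = dst;
    memcpy(tx->log[tx->n].data, buf, len);
    tx->log[tx->n].len = len;
    tx->n++;
}
static int pdsm_tx_commit(pdsm_tx_t *tx) {
    /* A real commit would persist a redo log first, then apply. */
    for (int i = 0; i < tx->n; i++)
        pdsm_write(tx->log[i].dst, tx->log[i].data, tx->log[i].len);
    return 0;
}

int main(void) {
    gaddr_t a = { .node_id = 0, .offset = 0 };

    /* Raw path: one write, one read back. */
    pdsm_write(a, "hello", 6);
    char buf[8];
    pdsm_read(a, buf, 6);
    printf("raw read: %s\n", buf);

    /* Transactional path: two updates commit atomically. */
    gaddr_t b = { .node_id = 0, .offset = 64 };
    pdsm_tx_t tx;
    pdsm_tx_begin(&tx);
    pdsm_tx_write(&tx, a, "world", 6);
    pdsm_tx_write(&tx, b, "!", 2);
    pdsm_tx_commit(&tx);
    pdsm_read(a, buf, 6);
    printf("tx read: %s\n", buf);
    return 0;
}
```

The split mirrors the abstract's design point: raw reads/writes keep the fast path close to native RDMA latency, while the transactional interface layers ACID guarantees on top for applications that need atomic multi-location updates.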
