


A Comparative Study of Consistent Snapshot Algorithms for Main-Memory Database Systems
IEEE Transactions on Knowledge and Data Engineering (IF 8.9), Pub Date: 2021-02-01, DOI: 10.1109/tkde.2019.2930987
Liang Li, Guoren Wang, Gang Wu, Ye Yuan, Lei Chen, Xiang Lian

In-memory databases (IMDBs) are gaining popularity in big data applications, where clients commit updates intensively. IMDBs need efficient snapshotting to support applications such as consistent checkpointing and HTAP. Formally, the in-memory consistent snapshot problem refers to taking a consistent point-in-time snapshot of in-memory data under two constraints: 1) clients can still read the latest data items, and 2) no data item in the snapshot may be overwritten. Various snapshot algorithms have been proposed in academia to trade off throughput and latency, but industrial IMDBs such as Redis adhere to the simple fork algorithm. To understand this phenomenon, we conduct comprehensive performance evaluations of mainstream snapshot algorithms. Surprisingly, we observe that the simple fork algorithm indeed outperforms the state-of-the-art algorithms in update-intensive workload scenarios. On this basis, we identify the drawbacks of existing research and propose two lightweight improvements. Extensive evaluations on synthetic data and Redis show that our lightweight improvements yield better performance than fork, the current industrial standard, and the representative snapshot algorithms from academia. Finally, we have open-sourced the implementation of all the above snapshot algorithms so that practitioners can benchmark each algorithm and select proper methods for different application scenarios.
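To make the two constraints concrete, here is a minimal sketch of the general fork-based approach the abstract refers to. It is not the paper's implementation or Redis's code. The names `fork_snapshot`, `store`, and `snapshot.json` are hypothetical, and the sketch assumes a POSIX system where `os.fork` is available. The child process gets a copy-on-write image of the parent's memory as of the fork, so the parent can keep applying updates while the child persists the frozen state.

```python
import json
import os
import tempfile


def fork_snapshot(store: dict, path: str) -> int:
    """Fork a child that persists `store` as it was at fork time.

    Returns the child's pid so the caller can wait for it.
    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: its view of memory is frozen at the moment of the fork.
        # Later writes by the parent are not visible here.
        try:
            with open(path, "w") as f:
                json.dump(store, f)
        finally:
            # Exit immediately without running the parent's cleanup handlers.
            os._exit(0)
    return pid


store = {"a": 1, "b": 2}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "snapshot.json")
    pid = fork_snapshot(store, path)

    # Parent: clients keep updating and reading the latest data (constraint 1).
    store["a"] = 100
    store["c"] = 3

    os.waitpid(pid, 0)  # wait until the child has finished writing
    with open(path) as f:
        snapshot = json.load(f)

# The snapshot holds the fork-time state; the updates did not overwrite it
# (constraint 2), while the live store reflects them.
print(snapshot)
print(store)
```

The parent pays the cost of the `fork` call itself plus page copies triggered by later writes, which is why update-intensive workloads are the interesting case for this comparison.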


Updated: 2021-02-01
