Improving LSM‐trie performance by parallel search
Software: Practice and Experience (IF 2.6), Pub Date: 2020-08-03, DOI: 10.1002/spe.2875
Wen Cheng 1, Tao Guo 1, Lingfang Zeng 1, Yang Wang 2, Lars Nagel 3, Tim Süß 4, André Brinkmann 4

LSM‐trie‐based key‐value (KV) stores are often used to manage ultralarge datasets in practice. By introducing a number of sublevels at each level, the LSM‐trie grows linearly, which can considerably reduce the write amplification of store operations. Although this design is effective for writes, the last level holds a large proportion of the KV items, so the data distribution is extremely imbalanced, and efficient reads must take this imbalance into account. In addition, to ensure that the data it returns is the latest, the LSM‐trie searches the levels one by one, and this serial search can spend a lot of unnecessary time. When the number of items is ultralarge, random lookup performance may therefore be poor because of the imbalanced data distribution. To address this issue, we improve the read performance of the LSM‐trie by changing its serial search to a parallel search, in which two threads simultaneously search the last level and the other levels, respectively. Our experimental results show that the read performance of the LSM‐trie can be improved by up to 98.35% and by 71.55% on average.
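
The two-thread lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: each level is modeled as an in-memory dict, levels are ordered from newest to oldest with at least one level present, deletion markers are not modeled, and the names `serial_get`, `parallel_get`, `search_upper`, and `search_last` are hypothetical. The point is the control flow: the large last level is probed concurrently with the smaller upper levels, and a hit in an upper level takes precedence because it is newer.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional

# Assumption: a dict stands in for one on-disk level of the store.
Level = Dict[str, str]


def search_upper(levels: List[Level], key: str) -> Optional[str]:
    """Search all levels except the last, newest first; the first hit wins."""
    for level in levels:
        if key in level:
            return level[key]
    return None


def search_last(level: Level, key: str) -> Optional[str]:
    """Search only the last (largest) level."""
    return level.get(key)


def serial_get(levels: List[Level], key: str) -> Optional[str]:
    """Baseline: visit every level one by one, newest first."""
    for level in levels:
        if key in level:
            return level[key]
    return None


def parallel_get(levels: List[Level], key: str,
                 pool: ThreadPoolExecutor) -> Optional[str]:
    """Search the upper levels and the last level at the same time."""
    upper, last = levels[:-1], levels[-1]
    upper_future = pool.submit(search_upper, upper, key)
    last_future = pool.submit(search_last, last, key)
    upper_hit = upper_future.result()
    last_hit = last_future.result()
    # An upper-level hit is newer than anything in the last level.
    return upper_hit if upper_hit is not None else last_hit


if __name__ == "__main__":
    levels = [{"a": "new"}, {"b": "mid"}, {"a": "old", "c": "last"}]
    with ThreadPoolExecutor(max_workers=2) as pool:
        for key in ("a", "b", "c", "missing"):
            print(key, parallel_get(levels, key, pool))
```

In CPython, dict lookups like these will not run faster in threads; the sketch only shows how the two searches are split and merged. Any speedup in a real store would come from overlapping the I/O of the last-level search with the upper-level search.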

Updated: 2020-08-03