Succinct Range Filters
ACM Transactions on Database Systems (IF 2.2), Pub Date: 2020-06-22, DOI: 10.1145/3375660
Huanchen Zhang 1 , Hyeontaek Lim 1 , Viktor Leis 2 , David G. Andersen 1 , Michael Kaminsky 3 , Kimberly Keeton 4 , Andrew Pavlo 1

We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries: open-range queries, closed-range queries, and range counts. SuRF is based on a new data structure called the Fast Succinct Trie (FST) that matches the point and range query performance of state-of-the-art order-preserving indexes, while consuming only 10 bits per trie node. The false-positive rates in SuRF for both point and range queries are tunable to satisfy different application needs. We evaluate SuRF in RocksDB as a replacement for its Bloom filters to reduce I/O by filtering requests before they access on-disk data structures. Our experiments on a 100-GB dataset show that replacing RocksDB’s Bloom filters with SuRFs speeds up open-seek (without upper-bound) and closed-seek (with upper-bound) queries by up to 1.5× and 5×, respectively, with a modest cost to worst-case (all-missing) point query throughput due to a slightly higher false-positive rate.
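
To make the idea of an approximate range filter concrete, here is a minimal Python sketch. It is not SuRF and does not use the FST encoding: it assumes a much simpler stand-in, a sorted list of keys truncated to a fixed length, and every name in it (`ToyPrefixRangeFilter`, `prefix_len`, `may_contain`, `may_contain_range`) is hypothetical. It only illustrates the contract the abstract describes: point and closed-range queries that may return false positives but never false negatives, with a knob that trades space for accuracy.

```python
import bisect


class ToyPrefixRangeFilter:
    """Toy approximate range filter (illustrative stand-in, not SuRF/FST).

    Stores each key truncated to `prefix_len` characters. Truncation
    preserves lexicographic order, so answers can be wrong only in the
    "maybe present" direction.
    """

    def __init__(self, keys, prefix_len):
        self.prefix_len = prefix_len
        # Deduplicate and sort the truncated keys for binary search.
        self.prefixes = sorted({k[:prefix_len] for k in keys})

    def may_contain(self, key):
        """Point query: False means the key is definitely absent."""
        p = key[:self.prefix_len]
        i = bisect.bisect_left(self.prefixes, p)
        return i < len(self.prefixes) and self.prefixes[i] == p

    def may_contain_range(self, lo, hi):
        """Closed-range query [lo, hi]: False means no stored key is in range."""
        lo_p = lo[:self.prefix_len]
        hi_p = hi[:self.prefix_len]
        # First stored prefix that is >= the truncated lower bound.
        i = bisect.bisect_left(self.prefixes, lo_p)
        return i < len(self.prefixes) and self.prefixes[i] <= hi_p


# Usage: a storage engine would skip the disk read whenever the filter says False.
toy = ToyPrefixRangeFilter(["apple", "banana", "cherry"], prefix_len=3)
print(toy.may_contain("apple"))            # stored key
print(toy.may_contain("application"))      # absent key sharing the prefix "app"
print(toy.may_contain_range("d", "f"))     # range with no stored keys
```

In this sketch, raising `prefix_len` lowers the false-positive rate at the cost of more stored characters per key, which loosely mirrors the tunable trade-off the abstract mentions.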

Updated: 2020-06-22