Hierarchical Bitmap Indexing for Range and Membership Queries on Multidimensional Arrays
arXiv - CS - Databases Pub Date : 2021-08-31 , DOI: arxiv-2108.13735
Luboš Krčál (Czech Technical University in Prague, Czech Republic), Shen-Shyang Ho (Rowan University, Glassboro, NJ, USA), Jan Holub (Czech Technical University in Prague, Czech Republic)

Traditional indexing techniques commonly employed in database systems perform poorly on multidimensional array scientific data. Bitmap indices are widely used in commercial databases for processing complex queries, due to their effective use of bit-wise operations and their space efficiency. However, bitmap indices apply natively to relational or linearized datasets, which is especially notable in binned or compressed indices. We propose a new method for multidimensional array indexing that overcomes the dimensionality-induced inefficiencies. The hierarchical indexing method is based on $n$-dimensional sparse trees for dimension partitioning, with a bounded number of individual, adaptively binned indices for attribute partitioning. This indexing performs well on range queries involving both dimensions and attributes, as it prunes the search space early, avoids reading the entire index data, and does at most a single index traversal. Moreover, the indexing is easily extensible to membership queries. The indexing method was implemented on top of FastBit, a state-of-the-art bitmap indexing library. We show that the hierarchical bitmap index outperforms conventional bitmap indexing built on an auxiliary attribute for each dimension. Furthermore, the adaptive binning significantly reduces the number of bins and therefore the memory requirements.
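
The idea of pruning by dimension and by attribute bins in one traversal can be illustrated with a small Python sketch. This is not the authors' implementation and does not use FastBit's API. It assumes a 2-dimensional array, a plain quadtree in place of the $n$-dimensional sparse tree, and fixed equal-width bins in place of adaptive binning. The names `bin_of`, `bins_mask`, `Node`, and `range_query` are hypothetical and introduced only for illustration. Each tree node stores a bitmask of the bins that occur inside its box, so a query can skip a whole subtree when the box lies outside the dimension range or holds no value in a matching bin.

```python
def bin_of(value, lo, width, nbins):
    """Equal-width bin number for value, clamped to [0, nbins - 1]."""
    return min(nbins - 1, max(0, int((value - lo) // width)))


def bins_mask(vlo, vhi, lo, width, nbins):
    """Bitmask of every bin that can hold a value in [vlo, vhi]."""
    first = bin_of(vlo, lo, width, nbins)
    last = bin_of(vhi, lo, width, nbins)
    return sum(1 << b for b in range(first, last + 1))


class Node:
    """Quadtree node covering rows [r0, r1) and columns [c0, c1)."""

    def __init__(self, data, r0, r1, c0, c1, lo, width, nbins, leaf=2):
        self.box = (r0, r1, c0, c1)
        self.children = []
        self.mask = 0  # bit b is set if some cell in the box falls in bin b
        if r1 - r0 <= leaf and c1 - c0 <= leaf:
            for r in range(r0, r1):
                for c in range(c0, c1):
                    self.mask |= 1 << bin_of(data[r][c], lo, width, nbins)
            return
        rm, cm = (r0 + r1 + 1) // 2, (c0 + c1 + 1) // 2
        for a, b in ((r0, rm), (rm, r1)):
            for x, y in ((c0, cm), (cm, c1)):
                if a < b and x < y:  # skip empty quadrants
                    child = Node(data, a, b, x, y, lo, width, nbins, leaf)
                    self.children.append(child)
                    self.mask |= child.mask


def range_query(node, data, rows, cols, vlo, vhi, lo, width, nbins):
    """Cells (r, c) with rows[0] <= r < rows[1], cols[0] <= c < cols[1]
    and vlo <= data[r][c] <= vhi, found in a single tree traversal."""
    qmask = bins_mask(vlo, vhi, lo, width, nbins)
    out = []

    def visit(n):
        r0, r1, c0, c1 = n.box
        # Prune on dimensions: the box does not overlap the query window.
        if r1 <= rows[0] or rows[1] <= r0 or c1 <= cols[0] or cols[1] <= c0:
            return
        # Prune on attributes: no cell in the box falls in a queried bin.
        if n.mask & qmask == 0:
            return
        if n.children:
            for child in n.children:
                visit(child)
            return
        # Leaf: bins are coarse, so check the candidate cells exactly.
        for r in range(max(r0, rows[0]), min(r1, rows[1])):
            for c in range(max(c0, cols[0]), min(c1, cols[1])):
                if vlo <= data[r][c] <= vhi:
                    out.append((r, c))

    visit(node)
    return out


data = [
    [1, 2, 9, 9],
    [3, 4, 9, 9],
    [5, 6, 7, 8],
    [5, 6, 7, 8],
]
root = Node(data, 0, 4, 0, 4, lo=0.0, width=2.5, nbins=4)
# Rows 0..2, all columns, values between 3 and 6 inclusive.
print(range_query(root, data, (0, 3), (0, 4), 3, 6, 0.0, 2.5, 4))
```

In this toy example the top-right quadrant holds only values in the highest bin, so the query skips it without reading any cell. A real index would store compressed bitmaps per bin rather than a single presence mask per node.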

Updated: 2021-09-01