SEACOW: Synopsis Embedded Array Compression using Wavelet Transform
arXiv - CS - Databases, Pub Date: 2021-09-16, DOI: arxiv-2109.07699
Minsoo Kim, Hyubjin Lee, Yon Dohn Chung
Multidimensional data is now produced in many domains. Because large volumes
of this data are often used in complex analytical tasks, it must be stored
compactly and must support fast responses to queries. Existing compression
schemes reduce storage effectively; however, they can increase the overall
computational cost of processing queries. Querying compressed data efficiently
requires a compression scheme designed with those queries in mind.
This study presents a novel compression scheme, SEACOW, for storing and
querying multidimensional array data. The scheme is based on the wavelet
transform and exploits the hierarchical relationship between sub-arrays in the
transformed data to compress the array. The compressed result embeds a
synopsis, which improves query processing performance while also acting as an
index. For the experiments, we implemented an array database, SEACOW storage,
and evaluated query processing performance on real data sets. Our experiments
show that 1) SEACOW provides a high compression ratio comparable to existing
compression schemes and 2) the synopsis improves analytical query processing
performance.
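To make the idea of a wavelet-derived synopsis concrete, the following is a minimal sketch, not SEACOW's actual algorithm. It assumes a single level of an unnormalized 2D Haar transform on a small array with even dimensions; the function names `haar2d_level` and `inverse_haar2d_level` and the sample data are hypothetical. The approximation band holds the mean of each 2x2 block, so it can serve as a coarse summary of the array, while the detail bands hold what is needed to reconstruct the original values.

```python
def haar2d_level(a):
    """One level of an unnormalized 2D Haar transform.

    `a` is a list of equal-length rows with even height and width.
    Returns (approx, (horiz, vert, diag)), each a quarter-size array.
    """
    approx, horiz, vert, diag = [], [], [], []
    for i in range(0, len(a), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, len(a[0]), 2):
            p, q = a[i][j], a[i][j + 1]
            r, s = a[i + 1][j], a[i + 1][j + 1]
            a_row.append((p + q + r + s) / 4)  # block mean (the synopsis)
            h_row.append((p - q + r - s) / 4)  # left-minus-right detail
            v_row.append((p + q - r - s) / 4)  # top-minus-bottom detail
            d_row.append((p - q - r + s) / 4)  # diagonal detail
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, (horiz, vert, diag)


def inverse_haar2d_level(approx, details):
    """Rebuild the full-size array from one level of coefficients."""
    horiz, vert, diag = details
    out = [[0.0] * (2 * len(approx[0])) for _ in range(2 * len(approx))]
    for i in range(len(approx)):
        for j in range(len(approx[0])):
            m, h, v, d = approx[i][j], horiz[i][j], vert[i][j], diag[i][j]
            out[2 * i][2 * j] = m + h + v + d
            out[2 * i][2 * j + 1] = m - h + v - d
            out[2 * i + 1][2 * j] = m + h - v - d
            out[2 * i + 1][2 * j + 1] = m - h - v + d
    return out


# Hypothetical sample data, chosen only for illustration.
data = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
approx, details = haar2d_level(data)
restored = inverse_haar2d_level(approx, details)

# An aggregate over the whole array answered from the synopsis alone:
# each approximation value is the mean of four original cells.
total_from_synopsis = 4 * sum(sum(row) for row in approx)
```

In this sketch, an aggregate such as a sum over block-aligned regions can be computed from `approx` without touching the detail bands, which is the general sense in which a synopsis can speed up analytical queries. How SEACOW encodes the coefficients and uses the hierarchy between sub-arrays is not specified in the abstract and is not modeled here.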
Updated: 2021-09-17