RSATree: Distribution-Aware Data Representation of Large-Scale Tabular Datasets for Flexible Visual Query.
IEEE Transactions on Visualization and Computer Graphics (IF 5.2), Pub Date: 2019-08-20, DOI: 10.1109/tvcg.2019.2934800
Honghui Mei, Wei Chen, Yating Wei, Yuanzhe Hu, Shuyue Zhou, Bingru Lin, Ying Zhao, Jiazhi Xia

To visualize and analyze a large-scale dataset, analysts commonly investigate data distributions derived from statistical aggregations and shown in charts such as histograms and binned scatterplots. Aggregate queries are executed implicitly during this process. Because such datasets are often extremely large, response time is typically reduced by precomputing data cubes. However, queries are then limited to the predefined binning schema of the preprocessed data cubes. This limitation prevents analysts from flexibly adjusting visual specifications to investigate implicit patterns in the data effectively. RSATree addresses this by enabling arbitrary queries and flexible binning strategies through three schemes: an R-tree-based space partitioning scheme to capture the data distribution, a locality-sensitive hashing technique to achieve locality-preserving random access to data items, and a summed area table scheme to support interactive queries of aggregated values with linear computational complexity. This study presents and implements a web-based visual query system that supports visual specification, query, and exploration of large-scale tabular data at user-adjustable granularities. We demonstrate the efficiency and utility of our approach through experiments on real-world datasets and an analysis of time and space complexity.
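
Of the three schemes named above, the summed area table is the easiest to illustrate in isolation. The following is a minimal sketch of a generic summed area table over a small, fixed 2D grid of counts; it is not RSATree's actual data structure, and the function names (`build_sat`, `range_sum`, `rebin_columns`) and the sample grid are assumptions made for illustration only. It shows why re-binning does not require rescanning the raw data: once the table is built, the aggregate for any rectangular bin can be read from four table entries.

```python
from typing import List


def build_sat(counts: List[List[int]]) -> List[List[int]]:
    """Build a (rows+1) x (cols+1) summed area table with a zero border.

    sat[r][c] holds the sum of counts over rows [0, r) and columns [0, c).
    """
    rows = len(counts)
    cols = len(counts[0]) if rows else 0
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (
                counts[r][c] + sat[r][c + 1] + sat[r + 1][c] - sat[r][c]
            )
    return sat


def range_sum(sat: List[List[int]], r0: int, c0: int, r1: int, c1: int) -> int:
    """Sum over rows [r0, r1) and columns [c0, c1) using four lookups."""
    return sat[r1][c1] - sat[r0][c1] - sat[r1][c0] + sat[r0][c0]


def rebin_columns(sat: List[List[int]], edges: List[int]) -> List[int]:
    """Aggregate all rows into column bins delimited by consecutive edges."""
    rows = len(sat) - 1
    return [range_sum(sat, 0, a, rows, b) for a, b in zip(edges, edges[1:])]


# Hypothetical 3 x 4 grid of per-cell item counts.
counts = [
    [1, 2, 0, 4],
    [0, 3, 1, 1],
    [2, 0, 5, 2],
]
sat = build_sat(counts)

inner = range_sum(sat, 1, 1, 3, 3)         # rows 1-2, columns 1-2
coarse = rebin_columns(sat, [0, 2, 4])     # two equal-width bins
uneven = rebin_columns(sat, [0, 1, 4])     # one narrow bin, one wide bin
```

In this sketch the bin edges are chosen at query time, so the same table serves both the equal-width and the uneven binning; only the cost of building the table depends on the grid size.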

Updated: 2019-11-01