Dependency Visualization in Data Stream Profiling
Big Data Research (IF 3.3), Pub Date: 2021-06-24, DOI: 10.1016/j.bdr.2021.100240
Bernardo Breve, Loredana Caruccio, Stefano Cirillo, Vincenzo Deufemia, Giuseppe Polese

Data stream profiling concerns the automatic extraction of metadata from a data stream that cannot be stored. Among the metadata of interest, functional dependencies (fds) and their extensions, relaxed functional dependencies (rfds), represent an important semantic property of data. Many algorithms now exist for automatically discovering them from static datasets, and some are being proposed for data streams. However, one of the main problems is that the streaming nature of the data requires a different monitoring paradigm, since the large number of (r)fds that might hold on a given dataset changes continuously as new data are read from the stream. In this paper, we present a tool for visualizing rfds discovered from a data stream. The tool lets users explore results for different types of rfds, and it uses quantitative measures to monitor how discovery results evolve. Moreover, the tool enables the comparison of rfds discovered across several executions, and it provides visual manipulation operators to dynamically compose and filter results. A user study was conducted to assess the effectiveness of the proposed visualization tool.
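To illustrate why the set of dependencies that hold keeps changing as the stream is read, here is a minimal sketch, not the discovery algorithm or the tool described in the paper. It monitors a single exact fd over a toy stream. The class name `FDMonitor`, the attribute names `zip` and `city`, and the sample rows are all hypothetical. The sketch keeps one summary entry per distinct left-hand-side value rather than the raw rows, which is a simplification and still grows with the number of distinct values.

```python
from typing import Dict, Hashable, Tuple


class FDMonitor:
    """Tracks whether the fd lhs -> rhs still holds on the rows seen so far.

    Hypothetical helper for illustration only.
    """

    def __init__(self, lhs: Tuple[str, ...], rhs: str) -> None:
        self.lhs = lhs
        self.rhs = rhs
        # Maps each observed lhs value combination to the first rhs value seen.
        self.seen: Dict[Tuple[Hashable, ...], Hashable] = {}
        self.holds = True

    def update(self, row: dict) -> bool:
        """Consume one row and return whether the fd still holds."""
        key = tuple(row[attr] for attr in self.lhs)
        value = row[self.rhs]
        # A violation occurs when the same lhs value maps to a different rhs value.
        if self.seen.setdefault(key, value) != value:
            self.holds = False
        return self.holds


# Toy stream (made-up data): the last row violates zip -> city.
stream = [
    {"zip": "84084", "city": "Fisciano"},
    {"zip": "84100", "city": "Salerno"},
    {"zip": "84084", "city": "Fisciano"},
    {"zip": "84084", "city": "Salerno"},
]

monitor = FDMonitor(lhs=("zip",), rhs="city")
history = [monitor.update(row) for row in stream]
print(history)  # [True, True, True, False]
```

The dependency holds for the first three rows and is invalidated by the fourth. A relaxed fd would tolerate such a row up to some threshold or similarity measure instead of failing outright. A monitoring view over many candidate dependencies has to track exactly this kind of change over time.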




Updated: 2021-06-29