A Lightweight Algorithm to Uncover Deep Relationships in Data Tables
arXiv - CS - Databases, Pub Date: 2020-09-07, DOI: arxiv-2009.03358
Jin Cao, Yibo Zhao, Linjun Zhang, and Jason Li

Much of the data we collect today is in tabular form, with rows as records and columns as attributes associated with each record. Understanding the structural relationships in tabular data can greatly facilitate the data science process. Traditionally, much of this relational information is stored in the table schema and maintained by its creators, usually domain experts. In this paper, we develop automated methods to uncover deep relationships in a single data table without expert or domain knowledge. Our method can decompose a data table into layers of smaller tables, revealing its deep structure. The key to our approach is a computationally lightweight forward addition algorithm that recursively extracts the functional dependencies between table columns and scales to tables with many columns. With our solution, data scientists are provided with automatically generated, data-driven insights when exploring new data sets.
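
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the underlying idea, not the authors' method. It assumes a table held as a list of dicts, tests whether a set of columns functionally determines a target column, and greedily adds one column at a time until the dependency holds. The names `violations` and `forward_select`, the greedy criterion, and the sample rows are all hypothetical and chosen for illustration.

```python
def violations(rows, lhs, rhs):
    """Count how far `lhs -> rhs` is from holding.

    The dependency holds exactly when every distinct `lhs` value is paired
    with a single `rhs` value, i.e. when this function returns 0.
    """
    lhs_groups = {tuple(r[c] for c in lhs) for r in rows}
    full_groups = {tuple(r[c] for c in (*lhs, rhs)) for r in rows}
    return len(full_groups) - len(lhs_groups)


def forward_select(rows, target, candidates):
    """Greedily add columns until they determine `target`.

    Returns the chosen columns, or None if the candidates cannot
    determine `target`. This greedy rule is an assumption made for
    the sketch; the paper's selection rule may differ.
    """
    chosen = []
    remaining = [c for c in candidates if c != target]
    while violations(rows, chosen, target) > 0:
        if not remaining:
            return None
        # Pick the column that leaves the fewest conflicting groups.
        best = min(remaining, key=lambda c: violations(rows, chosen + [c], target))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical sample table: `zip` determines `city`, nothing determines `item`.
rows = [
    {"order_id": 1, "zip": "07974", "city": "Murray Hill", "item": "pen"},
    {"order_id": 2, "zip": "07974", "city": "Murray Hill", "item": "ink"},
    {"order_id": 3, "zip": "10001", "city": "New York", "item": "pen"},
    {"order_id": 4, "zip": "10001", "city": "New York", "item": "pad"},
]

city_determinants = forward_select(rows, "city", ["zip", "item"])
item_determinants = forward_select(rows, "item", ["zip", "city"])
```

In this sketch a column with unique values, such as `order_id`, trivially determines every other column, so a practical search would likely exclude or down-weight such keys before looking for more informative dependencies.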

Updated: 2020-09-09