Database Repairing with Soft Functional Dependencies
arXiv - CS - Databases Pub Date : 2020-09-29 , DOI: arxiv-2009.13821
Nofar Carmeli, Martin Grohe, Benny Kimelfeld, Ester Livshits, and Muhammad Tibi

A common interpretation of soft constraints penalizes the database for every violation of every constraint, where the penalty is the cost (weight) of the constraint. A computational challenge is that of finding an optimal subset: a collection of database tuples that minimizes the total penalty when each tuple has a cost of being excluded. When the constraints are strict (i.e., have an infinite cost), this subset is a "cardinality repair" of an inconsistent database; in soft interpretations, this subset corresponds to a "most probable world" of a probabilistic database, a "most likely intention" of a probabilistic unclean database, and so on. Within the class of functional dependencies, the complexity of finding a cardinality repair is thoroughly understood. Yet, very little is known about the complexity of this problem in the more general soft semantics. This paper makes significant progress in this direction. In addition to general insights about the hardness and approximability of the problem, we present algorithms for two special cases: a single functional dependency, and a bipartite matching. The latter is the problem of finding an optimal "almost matching" of a bipartite graph where a penalty is paid for every lost edge and every violation of monogamy.
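
The objective described in the abstract can be illustrated with a small brute-force sketch. This is not the paper's algorithm: it enumerates every subset and is exponential in the number of tuples. It assumes that a violation is counted once per pair of kept tuples that agree on the left-hand side of a functional dependency and disagree on its right-hand side. The names `violates`, `penalty`, `optimal_subset`, and the example relation `edges` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def violates(t1, t2, lhs, rhs):
    """True if t1 and t2 agree on all lhs attributes but differ on some rhs attribute."""
    return all(t1[a] == t2[a] for a in lhs) and any(t1[a] != t2[a] for a in rhs)

def penalty(table, kept, fds, deletion_cost):
    """Total penalty of keeping exactly the tuples whose indices are in `kept`.

    Assumption: each violating pair of kept tuples pays the weight of the
    violated dependency, and each excluded tuple pays its own deletion cost.
    """
    cost = sum(deletion_cost[i] for i in range(len(table)) if i not in kept)
    for i, j in combinations(sorted(kept), 2):
        for lhs, rhs, weight in fds:
            if violates(table[i], table[j], lhs, rhs):
                cost += weight
    return cost

def optimal_subset(table, fds, deletion_cost):
    """Exhaustive search over all subsets; returns (best_cost, best_kept_indices)."""
    n = len(table)
    best = None
    for r in range(n + 1):
        for kept in combinations(range(n), r):
            c = penalty(table, set(kept), fds, deletion_cost)
            if best is None or c < best[0]:
                best = (c, set(kept))
    return best

# A bipartite graph stored as a binary relation: each tuple is an edge (A, B).
# "Monogamy" is expressed by the two dependencies A -> B and B -> A.
edges = [{"A": 1, "B": "x"}, {"A": 1, "B": "y"}, {"A": 2, "B": "y"}]
unit_costs = [1, 1, 1]

def monogamy(weight):
    return [(["A"], ["B"], weight), (["B"], ["A"], weight)]

strict = optimal_subset(edges, monogamy(float("inf")), unit_costs)
heavy = optimal_subset(edges, monogamy(3), unit_costs)
light = optimal_subset(edges, monogamy(0.25), unit_costs)
```

With infinite weights the search returns `(1, {0, 2})`, a cardinality repair that drops the middle edge. With weight 3 the result is the same, while with weight 0.25 it is cheaper to keep all three edges and pay for the two violations, giving `(0.5, {0, 1, 2})`.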

Updated: 2020-09-30