Technical Perspective, ACM SIGMOD Record


Technical Perspective
ACM SIGMOD Record (IF 0.9), Pub Date: 2020-09-04, DOI: 10.1145/3422648.3422660
Reinhard Pichler


Traditionally, query answering means finding all answers to a given query over a given database. But what happens if the number of answers is prohibitively large, as may easily occur in a Big Data context? In such situations, it seems preferable to have a mechanism that produces one answer after another, with certain guarantees on the time between any two outputs, and to let the user decide when to stop. This leads us to the enumeration problem, which has received a lot of interest recently [1]. However, for the user to get a "realistic" picture of the entirety of answers, two crucial questions arise. First, how big is the portion of answers output so far compared with the total number of answers? Second, do the answers output so far reflect the variety of the complete set of answers? The first question refers to the counting problem, where we are interested in the total number of answers. The second question leads us to the problem of uniform generation, where we request that the answers be generated uniformly at random and thus form an unbiased sample of the complete set of answers.
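To make the three problems concrete, the following is a minimal sketch on a toy example. It is an illustration only, not a technique taken from the article or from the work it discusses. The relations `R` and `S`, the join query `Q(a, b, c) :- R(a, b), S(b, c)`, and the function names are all hypothetical, and the relations are assumed to contain no duplicate tuples.

```python
import random
from itertools import islice
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical toy database: relations R(a, b) and S(b, c), no duplicate tuples.
R: List[Tuple[int, int]] = [(1, 10), (2, 10), (3, 20)]
S: List[Tuple[int, str]] = [(10, "x"), (10, "y"), (20, "z")]


def build_index(s: List[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group the c-values of S by their join key b."""
    index: Dict[int, List[str]] = {}
    for b, c in s:
        index.setdefault(b, []).append(c)
    return index


def enumerate_answers(r, s) -> Iterator[Tuple[int, int, str]]:
    """Enumeration: yield the answers of Q one at a time, on demand."""
    index = build_index(s)
    for a, b in r:
        for c in index.get(b, []):
            yield (a, b, c)


def count_answers(r, s) -> int:
    """Counting: total number of answers, without materializing them."""
    index = build_index(s)
    return sum(len(index.get(b, [])) for _, b in r)


def sample_uniform(r, s, rng: random.Random) -> Optional[Tuple[int, int, str]]:
    """Uniform generation: return one answer, each with equal probability.

    An R-tuple is chosen with probability proportional to its number of
    join partners, then one partner is chosen uniformly, so every answer
    has probability 1 / count_answers(r, s).
    """
    index = build_index(s)
    weights = [len(index.get(b, [])) for _, b in r]
    if sum(weights) == 0:
        return None  # the query has no answers
    a, b = rng.choices(r, weights=weights, k=1)[0]
    return (a, b, rng.choice(index[b]))


if __name__ == "__main__":
    # The user decides when to stop: here, after the first two answers.
    seen = list(islice(enumerate_answers(R, S), 2))
    print(f"seen {len(seen)} of {count_answers(R, S)} answers: {seen}")
    print("random answer:", sample_uniform(R, S, random.Random()))
```

In this sketch the count tells the user what fraction of the answers has been output so far, while the sampler gives an unbiased view of answers that a fixed enumeration order might only reach late.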


Last updated: 2020-09-04
