NeuroCard: One Cardinality Estimator for All Tables
arXiv - CS - Databases Pub Date : 2020-06-15 , DOI: arxiv-2006.08109
Zongheng Yang, Amog Kamsetty, Sifei Luan, Eric Liang, Yan Duan, Xi Chen, and Ion Stoica

Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries because they make lossy modeling assumptions and fail to capture inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5$\times$ maximum error on JOB-light) and scales to dozens of tables, while remaining compact in space (several MBs) and efficient to construct or update (seconds to minutes).
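To make the "no independence assumptions" claim concrete, here is a minimal sketch, not NeuroCard's implementation. It compares an estimate that multiplies per-column selectivities (the independence assumption) with one that uses the chain rule P(a, b) = P(a) · P(b | a), which is the factorization an autoregressive model learns. Everything in it is an illustrative assumption: the toy rows stand in for a sample of a join result, the empirical conditionals stand in for a neural density estimator, and the column names are hypothetical. It also assumes the "8.5× maximum error" figure refers to the multiplicative Q-error commonly used in cardinality estimation benchmarks.

```python
# Toy stand-in for rows sampled from a join result (hypothetical data).
# The two columns are correlated: most "drama" rows are from 2001.
rows = [
    ("drama", 2001), ("drama", 2001), ("drama", 2001),
    ("comedy", 1995), ("comedy", 1995),
    ("drama", 1995),
]


def q_error(estimate, truth):
    """Multiplicative error; 1.0 means a perfect estimate.

    Both values are clamped to at least 1 to avoid division by zero.
    """
    estimate, truth = max(estimate, 1.0), max(truth, 1.0)
    return max(estimate / truth, truth / estimate)


def true_cardinality(rows, genre, year):
    """Exact number of rows matching both predicates."""
    return sum(1 for g, y in rows if g == genre and y == year)


def independence_estimate(rows, genre, year):
    """Estimate n * P(genre) * P(year), treating the columns as independent."""
    n = len(rows)
    p_genre = sum(1 for g, _ in rows if g == genre) / n
    p_year = sum(1 for _, y in rows if y == year) / n
    return n * p_genre * p_year


def autoregressive_estimate(rows, genre, year):
    """Estimate n * P(genre) * P(year | genre) via the chain rule.

    Empirical conditionals are used here; a learned autoregressive model
    would supply these conditional probabilities instead.
    """
    n = len(rows)
    genre_rows = [r for r in rows if r[0] == genre]
    if not genre_rows:
        return 0.0
    p_genre = len(genre_rows) / n
    p_year_given_genre = sum(1 for _, y in genre_rows if y == year) / len(genre_rows)
    return n * p_genre * p_year_given_genre


truth = true_cardinality(rows, "drama", 2001)
for name, estimator in [("independence", independence_estimate),
                        ("autoregressive", autoregressive_estimate)]:
    estimate = estimator(rows, "drama", 2001)
    print(name, "estimate:", estimate, "q-error:", q_error(estimate, truth))
```

On this toy data the independence estimate undercounts the correlated predicate pair, while the chain-rule estimate recovers the true count because it conditions the second column on the first. With more columns the same factorization extends to a product of one conditional per column.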


