Robust Cardinality: a novel approach for cardinality prediction in SQL queries
Journal of the Brazilian Computer Society, Pub Date: 2021-09-01, DOI: 10.1186/s13173-021-00115-9
Francisco D. B. S. Praciano, Paulo R. P. Amora, Italo C. Abreu, Francisco L. F. Pereira, Javam C. Machado

Database Management Systems (DBMSs) use a declarative language to express queries over stored data, so the DBMS itself decides how the data will be processed and ultimately retrieved. It must therefore choose the best option among the available alternatives, based on an estimation process. The query optimizer uses estimated cardinalities to make optimization decisions, such as choosing the predicate order. In this paper, we propose Robust Cardinality, an approach for computing cardinality estimates of query operations that guides the DBMS execution engine toward the best possible way of executing a query, or at least away from the worst one. Using machine learning instead of the current histogram heuristics makes it possible to improve these estimates, leading to more efficient query execution. We perform experimental tests using PostgreSQL, comparing Robust Cardinality against both PostgreSQL's default estimator and a modern technique proposed in the literature. With Robust Cardinality, we obtained a lower estimation error on a batch of queries, and PostgreSQL executed these queries more efficiently than with its default estimator. We observed a 3% reduction in execution time after reducing the query estimation error by a factor of 4. These results indicate that the new approach improves query processing in DBMSs, especially the generation of cardinality estimates.
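To make the notion of estimation error concrete, here is a minimal sketch of how a heuristic estimate can drift from the true cardinality. It is not the paper's method: the toy table, the column names `city` and `zip`, the attribute-independence heuristic, and the q-error metric are all assumptions chosen for illustration, since the abstract does not say which error metric or heuristic details were used.

```python
def true_cardinality(rows, predicates):
    """Count rows that satisfy every (column, value) equality predicate."""
    return sum(all(row[col] == val for col, val in predicates) for row in rows)


def independence_estimate(rows, predicates):
    """Estimate cardinality by multiplying per-column selectivities.

    This assumes the columns are statistically independent, a common
    simplification in heuristic estimators (assumed here for illustration).
    """
    n = len(rows)
    selectivity = 1.0
    for col, val in predicates:
        matching = sum(row[col] == val for row in rows)
        selectivity *= matching / n
    return n * selectivity


def q_error(estimate, actual):
    """Ratio between estimate and actual, always >= 1 (1 means exact)."""
    est = max(estimate, 1.0)  # clamp to avoid division by zero
    act = max(actual, 1.0)
    return max(est / act, act / est)


# Hypothetical table in which the two columns are perfectly correlated.
rows = [{"city": i % 10, "zip": i % 10} for i in range(1000)]
predicates = [("city", 3), ("zip", 3)]

actual = true_cardinality(rows, predicates)
estimate = independence_estimate(rows, predicates)
error = q_error(estimate, actual)

print(f"actual={actual}, estimate={estimate:.1f}, q-error={error:.1f}")
```

In this toy case the independence assumption underestimates the result size by roughly a factor of 10, because it ignores the correlation between the two columns. A learned estimator is one way to capture such correlations.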

Updated: 2021-09-01