Tempura: A General Cost Based Optimizer Framework for Incremental Data Processing (Extended Version)
arXiv - CS - Databases
Pub Date: 2020-09-28, DOI: arxiv-2009.13631
Zuozhi Wang, Kai Zeng, Botong Huang, Wei Chen, Xiaozong Cui, Bo Wang, Ji Liu, Liya Fan, Dachuan Qu, Zhenyu Hou, Tao Guan, Chen Li, Jingren Zhou
Incremental processing is widely adopted in many applications, ranging from
incremental view maintenance and stream computing to the recently emerging
progressive data warehouse and intermittent query processing. Although many
algorithms have been developed on this topic, none of them can produce an
incremental plan that always achieves the best performance, since the optimal
plan is data-dependent. In this paper, we develop a novel cost-based optimizer framework,
called Tempura, for optimizing incremental data processing. We propose an
incremental query planning model called TIP based on the concept of
time-varying relations, which can formally model incremental processing in its
most general form. We give a full specification of Tempura, which not only
unifies various existing techniques to generate an optimal incremental plan but
also allows developers to add their own rewrite rules. We study how to explore
the plan space and search for an optimal incremental plan. We conduct a
thorough experimental evaluation of Tempura in various incremental processing
scenarios to show its effectiveness and efficiency.
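
The claim that the optimal incremental plan is data-dependent can be illustrated with a toy sketch. It is not Tempura's algorithm or cost model: the `Delta` type, the two candidate plans ("eager" and "lazy"), and the weights `early_weight` and `final_weight` are all hypothetical names introduced here, assuming that work done before the final time point runs on cheaper resources and that cost is proportional to rows processed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """Rows inserted into and retracted from a relation between two time points."""
    inserted: int
    retracted: int


def eager_cost(deltas, early_weight, final_weight):
    """Process every delta as it arrives; only the last delta runs at the final time."""
    *early, last = deltas
    early_rows = sum(d.inserted + d.retracted for d in early)
    last_rows = last.inserted + last.retracted
    return early_weight * early_rows + final_weight * last_rows


def lazy_cost(deltas, final_weight):
    """Wait for the final snapshot and process its net rows once."""
    net_rows = sum(d.inserted - d.retracted for d in deltas)
    return final_weight * net_rows


def choose_plan(deltas, early_weight=0.25, final_weight=1.0):
    """Return the cheaper plan name and both costs under the toy cost model."""
    costs = {
        "eager": eager_cost(deltas, early_weight, final_weight),
        "lazy": lazy_cost(deltas, final_weight),
    }
    return min(costs, key=costs.get), costs


# Append-only data: early work is never wasted, so eager processing is cheaper.
append_only = [Delta(100, 0), Delta(100, 0), Delta(100, 0)]

# Retraction-heavy data: most early work is undone later, so waiting is cheaper.
retraction_heavy = [Delta(1000, 0), Delta(0, 900), Delta(10, 0)]

if __name__ == "__main__":
    print(choose_plan(append_only))
    print(choose_plan(retraction_heavy))
```

The same query and the same pair of plans give opposite winners on the two inputs, which is why a fixed incremental strategy cannot be best in every case and a cost-based choice is needed.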
Updated: 2020-09-30