MVPP-Based Materialized View Selection in Data Warehouses Using Simulated Annealing
International Journal of Cooperative Information Systems (IF 0.5), Pub Date: 2020-06-18, DOI: 10.1142/s021884302050001x
Mohsen Mohseni, Mohammad Karim Sohrabi

Extracting data from heterogeneous data sources, transforming them into an integrated, unified and cleaned repository, and storing the result as a single entity leads to the construction of a data warehouse (DW), which facilitates access to data for the users of information systems and decision support systems. Because of the enormous volume of data in a DW, processing the analytical queries of decision support systems requires scanning very large amounts of data, which has a negative effect on the systems' response time. Given the special importance of online analytical processing (OLAP) in these systems, an appropriate number of views of the DW are selected for materialization and are used to answer the analytical queries instead of accessing the base relations directly, which enhances performance and improves query response time. Memory constraints and view maintenance overhead are the two main limitations that make it impossible, in most cases, to materialize all views of the DW. Selecting a proper set of DW views for materialization, called the materialized view selection (MVS) problem, is an important research issue that has been addressed in various papers. In this paper, we propose a method, called P-SA, that selects an appropriate set of views using an improved version of the simulated annealing (SA) algorithm with a proper neighborhood selection strategy. P-SA uses the multiple view processing plan (MVPP) structure for selecting the views. Data and queries of a benchmark DW were used in the experiments to evaluate the introduced method. The experimental results show that, as the number of queries increases, P-SA performs better than other SA-based MVS methods in terms of the total cost of view maintenance and query processing. Moreover, the total cost of queries in P-SA is also better than that of the other important SA-based MVS methods in the literature when the size of the DW is increased.
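
To make the optimization problem in the abstract concrete, the following is a minimal sketch of plain simulated annealing applied to view selection. It is not the P-SA method: the abstract does not describe its neighborhood strategy or how it traverses the MVPP, so this sketch assumes a flat cost model instead. The view names, costs, sizes, space limit, cooling parameters and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy instance: every number here is invented for illustration.
# Each candidate view has a maintenance cost and a storage size.
VIEWS = {
    "v1": {"maint": 4.0, "size": 3},
    "v2": {"maint": 2.0, "size": 2},
    "v3": {"maint": 5.0, "size": 4},
}
# Each query has a frequency, a cost when answered from the base relations,
# and a (lower) cost when answered from a particular materialized view.
QUERIES = [
    {"freq": 10, "base": 20.0, "via": {"v1": 5.0}},
    {"freq": 5, "base": 30.0, "via": {"v1": 12.0, "v2": 8.0}},
    {"freq": 1, "base": 15.0, "via": {"v3": 6.0}},
]
SPACE_LIMIT = 6  # assumed storage budget


def total_cost(selected):
    """Query processing cost plus view maintenance cost for a view set."""
    query_cost = 0.0
    for q in QUERIES:
        options = [q["base"]] + [c for v, c in q["via"].items() if v in selected]
        query_cost += q["freq"] * min(options)
    maintenance = sum(VIEWS[v]["maint"] for v in selected)
    return query_cost + maintenance


def feasible(selected):
    """Check the storage constraint."""
    return sum(VIEWS[v]["size"] for v in selected) <= SPACE_LIMIT


def neighbor(selected, rng):
    """Simple neighborhood: toggle one randomly chosen view in or out."""
    v = rng.choice(sorted(VIEWS))
    return selected ^ {v}


def anneal(seed=0, t_start=50.0, t_end=0.01, alpha=0.95, steps_per_temp=20):
    rng = random.Random(seed)
    current = frozenset()  # start with nothing materialized
    best = current
    t = t_start
    while t > t_end:
        for _ in range(steps_per_temp):
            cand = neighbor(current, rng)
            if not feasible(cand):
                continue  # reject moves that exceed the storage budget
            delta = total_cost(cand) - total_cost(current)
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = cand
                if total_cost(current) < total_cost(best):
                    best = current
        t *= alpha  # geometric cooling schedule
    return best, total_cost(best)


if __name__ == "__main__":
    views, cost = anneal()
    print(sorted(views), cost)
```

In this toy instance, materializing nothing costs 365, while the feasible set {v1, v2} costs 111, so the search has a clear incentive to materialize views until the storage budget binds. A method such as P-SA would replace the flat cost tables with costs derived from the MVPP and use its own neighborhood selection strategy.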

Updated: 2020-06-18