WISE: Workload-Aware Partitioning for RDF Systems
Big Data Research (IF 3.5), Pub Date: 2020-10-27, DOI: 10.1016/j.bdr.2020.100161
Xintong Guo, Hong Gao, Zhaonian Zou

Large-scale knowledge graphs across many domains have proliferated in recent years, and they can no longer be managed on a single machine. Distributed RDF systems address the scalability problem with partitioning techniques. However, most of these systems are unaware of the query workload and employ static partitioning. As diverse and dynamic workloads keep emerging in knowledge graph applications, such systems cannot consistently provide good performance.

To address this problem, we propose WISE, a workload-aware partitioning framework that can be deployed on top of any initial partitioning. It encodes incoming SPARQL queries in a novel structure called the query span and periodically examines the query span to identify frequent query patterns. The triples matching a frequent query pattern are moved to the same partition, with the aim of improving future response times. Our experiments on various RDF datasets and workloads indicate that WISE achieves a dramatic reduction in communication and a considerable performance improvement over the baseline method. The migration overhead accounts for only a small portion of the total runtime.
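
The general idea of counting recurring query patterns and co-locating their triples can be illustrated with a toy sketch. This is not the paper's method: the abstract does not describe the query span structure, so a plain counter of predicate sets stands in for it here, and the function names (frequent_patterns, plan_migration), the min_support threshold, and the "move to the partition that already holds most matching triples" rule are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def frequent_patterns(workload: Iterable[Iterable[str]],
                      min_support: int) -> List[FrozenSet[str]]:
    """Return predicate sets seen at least min_support times in the workload.

    Assumption: a query is reduced to the set of predicates it touches.
    A real system would track full graph patterns, not just predicates.
    """
    counts = Counter(frozenset(query) for query in workload)
    return [pattern for pattern, n in counts.items() if n >= min_support]


def plan_migration(triples: Iterable[Triple],
                   assignment: Dict[Triple, int],
                   patterns: Iterable[FrozenSet[str]]) -> Dict[Triple, int]:
    """Plan moves so that each pattern's triples end up in one partition.

    Assumption: the target is the partition already holding the most
    matching triples, which keeps the number of moved triples small.
    """
    triples = list(triples)
    moves: Dict[Triple, int] = {}
    for pattern in patterns:
        matching = [t for t in triples if t[1] in pattern]
        if not matching:
            continue
        # Use the planned location if an earlier pattern already moved a triple.
        location = Counter(moves.get(t, assignment[t]) for t in matching)
        target = location.most_common(1)[0][0]
        for t in matching:
            if moves.get(t, assignment[t]) != target:
                moves[t] = target
    return moves


# Hypothetical workload: each query is listed by the predicates it uses.
workload = [["worksFor", "locatedIn"], ["locatedIn", "worksFor"], ["name"]]

t1 = ("alice", "worksFor", "acme")
t2 = ("bob", "worksFor", "acme")
t3 = ("acme", "locatedIn", "berlin")
t4 = ("alice", "name", "Alice")
assignment = {t1: 0, t2: 0, t3: 1, t4: 1}  # initial (static) partitioning

patterns = frequent_patterns(workload, min_support=2)
moves = plan_migration(assignment.keys(), assignment, patterns)
print(moves)
```

In this example only the worksFor/locatedIn pattern reaches the support threshold, so the single locatedIn triple is scheduled to join the two worksFor triples in partition 0 and the name triple stays where it is.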




Updated: 2020-11-04