Holistic evaluation in multi-model databases benchmarking, Distributed and Parallel Databases



Holistic evaluation in multi-model databases benchmarking
Distributed and Parallel Databases (IF 1.2), Pub Date: 2019-12-06, DOI: 10.1007/s10619-019-07279-6
Chao Zhang, Jiaheng Lu

A multi-model database (MMDB) is designed to support multiple data models against a single, integrated back-end. Examples of data models include document, graph, relational, and key-value. As more and more platforms are developed to deal with multi-model data, it has become crucial to establish a benchmark for evaluating the performance and usability of MMDBs. In this paper, we propose UniBench, a generic multi-model benchmark for a holistic evaluation of state-of-the-art MMDBs. UniBench consists of a set of mixed data models that mimics a social commerce application, which covers data models including JSON, XML, key-value, tabular, and graph. We propose a three-phase framework to simulate the real-life distributions and develop a multi-model data generator to produce the benchmarking data. Furthermore, in order to generate a comprehensive and unbiased query set, we develop an efficient algorithm to solve a new problem called multi-model parameter curation to judiciously control the query selectivity on diverse models. Finally, the extensive experiments based on the proposed benchmark were performed on four representatives of MMDBs: ArangoDB, OrientDB, AgensGraph and Spark SQL. We provide a comprehensive analysis with respect to internal data representations, multi-model query and transaction processing, and performance results for distributed execution.
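To make the idea of parameter curation concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes each candidate query parameter (here a hypothetical customer ID) has a known result size in each data model, and it simply keeps the candidates whose sizes lie closest to a chosen target, so that queries run with those parameters have comparable selectivity across models. The function name `curate_parameters`, the sample data, and the distance measure are all illustrative assumptions.

```python
from typing import Dict, List, Tuple


def curate_parameters(
    sizes: Dict[str, Tuple[int, ...]],
    target: Tuple[int, ...],
    k: int,
) -> List[str]:
    """Return the k candidates whose per-model result sizes are closest to target.

    Hypothetical helper: a nearest-to-target selection, used here only to
    illustrate controlling selectivity across several models at once.
    """

    def distance(candidate: str) -> int:
        # Sum of absolute deviations from the target size, one term per model.
        return sum(abs(s - t) for s, t in zip(sizes[candidate], target))

    # Sort by distance; break ties by name so the output is deterministic.
    return sorted(sizes, key=lambda c: (distance(c), c))[:k]


# Hypothetical per-customer result sizes: (friends in graph, orders in JSON).
candidate_sizes = {
    "c1": (10, 3),
    "c2": (200, 1),
    "c3": (12, 4),
    "c4": (9, 40),
    "c5": (11, 3),
}

selected = curate_parameters(candidate_sizes, target=(10, 3), k=3)
print(selected)  # ['c1', 'c5', 'c3']
```

In this toy data, `c2` and `c4` are rejected because each is an outlier in one model (many friends, or many orders), which would skew the running time of any query that touches that model.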


Updated: 2019-12-06
