Strategies for array data retrieval from a relational back-end based on access patterns, Computing

Computing (IF 3.3), Pub Date: 2020-03-30, DOI: 10.1007/s00607-020-00804-x
Andrej Andrejev, Kjell Orsborn, Tore Risch

Multidimensional numeric arrays are often serialized to binary formats for efficient storage and processing. These representations can be stored as binary objects in existing relational database management systems. To minimize data transfer overhead when arrays are large and only parts of them are accessed, it is advantageous to split the arrays into separately stored chunks. We process queries expressed in an extended version of the graph query language SPARQL, which treats arrays as node values and provides syntax for specifying array projection, element selection, and range selection operations as part of a query. When a query selects parts of one or more arrays, only the relevant chunks of each array should be retrieved from the relational database. The retrieval is performed by automatically generated SQL queries. We evaluate different strategies for partitioning the array content and for generating the SQL queries that retrieve it on demand. For this purpose, we present a mini-benchmark featuring a number of typical array access patterns, and we draw actionable conclusions from the measured performance.
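
The chunked-retrieval idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the table name `chunks`, the fixed chunk size `CHUNK_ELEMS`, the little-endian double serialization, and the helper functions `store_array` and `read_range` are all assumptions made for illustration, and SQLite stands in for the relational back-end. The sketch covers only a one-dimensional range selection, with a single generated SQL query that fetches just the chunks overlapping the requested range.

```python
import sqlite3
import struct

CHUNK_ELEMS = 4  # hypothetical chunk size, in elements (an assumption)


def store_array(conn, array_id, values, chunk_elems=CHUNK_ELEMS):
    """Split a flat list of floats into fixed-size chunks, one BLOB row each."""
    for start in range(0, len(values), chunk_elems):
        part = values[start:start + chunk_elems]
        blob = struct.pack(f"<{len(part)}d", *part)  # little-endian doubles
        conn.execute(
            "INSERT INTO chunks (array_id, chunk_id, data) VALUES (?, ?, ?)",
            (array_id, start // chunk_elems, blob),
        )


def read_range(conn, array_id, lo, hi, chunk_elems=CHUNK_ELEMS):
    """Return (elements lo..hi-1, number of chunk rows fetched)."""
    if lo >= hi:
        return [], 0
    # Only chunks first..last overlap the requested element range.
    first, last = lo // chunk_elems, (hi - 1) // chunk_elems
    rows = conn.execute(
        "SELECT chunk_id, data FROM chunks "
        "WHERE array_id = ? AND chunk_id BETWEEN ? AND ? ORDER BY chunk_id",
        (array_id, first, last),
    ).fetchall()
    fetched = []
    for _, blob in rows:
        fetched.extend(struct.unpack(f"<{len(blob) // 8}d", blob))
    offset = first * chunk_elems  # array index of the first fetched element
    return fetched[lo - offset:hi - offset], len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (array_id INTEGER, chunk_id INTEGER, data BLOB, "
    "PRIMARY KEY (array_id, chunk_id))"
)
store_array(conn, 1, [float(i) for i in range(20)])
values, chunks_read = read_range(conn, 1, 5, 11)
print(values, chunks_read)
```

In this example the 20-element array is stored as five chunks, and selecting elements 5 through 10 transfers only two of them. A real system would also have to choose chunk shapes for multidimensional arrays and decide how to batch chunk requests into SQL, which is the kind of strategy choice the paper evaluates.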

Updated: 2020-03-30
