OrpheusDB: bolt-on versioning for relational databases (extended version)
The VLDB Journal (IF 2.8), Pub Date: 2019-12-20, DOI: 10.1007/s00778-019-00594-5
Silu Huang, Liqi Xu, Jialin Liu, Aaron J. Elmore, Aditya Parameswaran

Data science teams often collaboratively analyze datasets, generating dataset versions at each stage of iterative exploration and analysis. There is a pressing need for a system that can support dataset versioning, enabling such teams to efficiently store, track, and query across dataset versions. We introduce OrpheusDB, a dataset version control system that “bolts on” versioning capabilities to a traditional relational database system, thereby gaining the analytics capabilities of the database “for free.” We develop and evaluate multiple data models for representing versioned data, as well as a lightweight partitioning scheme, LyreSplit, to further optimize the models for reduced query latencies. With LyreSplit, OrpheusDB is on average \(10^3\times \) faster in finding effective (and better) partitionings than competing approaches, while also reducing the latency of version retrieval by up to \(20\times \) relative to schemes without partitioning. LyreSplit can be applied in an online fashion as new versions are added, alongside an intelligent migration scheme that reduces migration time by \(10\times \) on average.

Updated: 2019-12-20