Dragoon: a hybrid and efficient big trajectory management system for offline and online analytics
The VLDB Journal (IF 2.8), Pub Date: 2021-02-03, DOI: 10.1007/s00778-021-00652-x
Ziquan Fang, Lu Chen, Yunjun Gao, Lu Pan, Christian S. Jensen

With the explosive use of GPS-enabled devices, increasingly massive volumes of trajectory data capturing the movements of people and vehicles are becoming available. Such data are useful in many application areas, such as transportation, traffic management, and location-based services. As a result, many trajectory data management and analytics systems have emerged that target either offline or online settings. However, some applications call for both offline and online analyses. For example, in traffic management scenarios, offline analyses of historical trajectory data can be used for traffic planning, while online analyses of streaming trajectories can be used for congestion monitoring. Existing trajectory-based systems tend to perform offline and online trajectory analysis separately, which is inefficient. In this paper, we propose a hybrid and efficient framework, called Dragoon, based on Spark, to support both offline and online big trajectory management and analytics. The framework features a mutable resilient distributed dataset model, including RDD Share, RDD Update, and RDD Mirror, which enables hybrid storage of historical and streaming trajectories. It also contains a real-time partitioner capable of efficiently distributing trajectory data and supporting both offline and online analyses. Dragoon therefore provides a hybrid analysis pipeline. Support for several typical trajectory queries and mining tasks demonstrates the flexibility of Dragoon. An extensive experimental study using both real and synthetic trajectory datasets shows that Dragoon (1) achieves offline trajectory query performance similar to that of the state-of-the-art system UlTraMan; (2) reduces storage overhead by up to a factor of two compared with UlTraMan during trajectory editing; (3) improves scalability by at least 40% compared with popular stream processing frameworks (i.e., Flink and Spark Streaming); and (4) delivers, on average, a twofold performance improvement for online trajectory data analytics.




Updated: 2021-02-03