Dissecting GeoSparkSim: a scalable microscopic road network traffic simulator in Apache Spark
Distributed and Parallel Databases ( IF 1.5 ) Pub Date : 2020-07-07 , DOI: 10.1007/s10619-020-07306-x
Jia Yu , Zishan Fu , Mohamed Sarwat

Researchers and practitioners have widely studied road network traffic data in areas such as urban planning, traffic prediction, and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale, high-quality urban traffic data requires tremendous effort because participating vehicles must install Global Positioning System (GPS) receivers and administrators must continuously monitor these devices. Several urban traffic simulators attempt to generate such data with different features. However, they suffer from two critical issues: (1) Scalability: most of them offer only a single-machine solution, which is not adequate for producing large-scale data. Some simulators can generate traffic in parallel but do not balance the load well among the machines in a cluster. (2) Granularity: many simulators do not consider microscopic traffic situations such as traffic lights, lane changing, and car following. This paper proposes GeoSparkSim, a scalable traffic simulator that extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system integrates seamlessly with GeoSpark, a Spark-based spatial data management system, to deliver a holistic approach that allows data scientists to simulate, analyze, and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method that distributes vehicles among machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 300 thousand vehicles over a very large road network (250 thousand road junctions and 300 thousand road segments) and outperforms existing competitors.

Updated: 2020-07-07