Simulating multi-exit evacuation using deep reinforcement learning
Transactions in GIS (IF 2.568), Pub Date: 2021-03-11, DOI: 10.1111/tgis.12738
Dong Xu 1,2, Xiao Huang 3, Joseph Mango 1, Xiang Li 1, Zhenlong Li 2

Conventional simulations of multi-exit indoor evacuation focus primarily on how to determine a reasonable exit based on numerous factors in a changing environment. Their results commonly show some exits congested while others are under-utilized, especially with large numbers of pedestrians. We propose a multi-exit evacuation simulation based on deep reinforcement learning (DRL), referred to as MultiExit-DRL, which uses a deep neural network (DNN) framework to map states to actions. The DNN framework applies Rainbow Deep Q-Network (DQN), a DRL algorithm that integrates several advanced DQN methods, to improve data utilization and algorithm stability, and it divides the action space into eight isometric directions among which pedestrians can choose. We compare MultiExit-DRL with two conventional multi-exit evacuation simulation models in three separate scenarios: varying pedestrian distribution ratios, varying exit width ratios, and varying opening schedules for an exit. The results show that MultiExit-DRL learns efficiently and reduces the total number of evacuation frames in all designed experiments. In addition, the integration of DRL allows pedestrians to explore other potential exits and helps determine optimal directions, leading to highly efficient exit utilization.
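
The eight-direction action space and the state-to-action mapping described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `ACTIONS`, `select_action`, and `step` are hypothetical, the directions are assumed to be unit headings spaced 45 degrees apart, and the Q-values are assumed to come from some trained network that is not shown. Rainbow DQN usually explores through noisy network layers; plain epsilon-greedy selection is swapped in here only because it is simpler to show.

```python
import math
import random

# Assumed encoding: action index k is a unit heading at k * 45 degrees,
# measured counter-clockwise from the positive x-axis.
ACTIONS = [
    (math.cos(math.radians(45 * k)), math.sin(math.radians(45 * k)))
    for k in range(8)
]


def select_action(q_values, epsilon, rng=random):
    """Pick a direction index from eight Q-values (epsilon-greedy stand-in)."""
    if len(q_values) != len(ACTIONS):
        raise ValueError("expected one Q-value per direction")
    # With probability epsilon, explore a random direction.
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    # Otherwise exploit the direction with the highest estimated value.
    return max(range(len(ACTIONS)), key=lambda k: q_values[k])


def step(position, action, speed=1.0):
    """Move a pedestrian one simulation frame along the chosen heading."""
    dx, dy = ACTIONS[action]
    return (position[0] + speed * dx, position[1] + speed * dy)
```

With `epsilon` set to 0 the choice is purely greedy, so a pedestrian whose Q-values peak at index 2 moves along the 90-degree heading. In a full simulation the Q-values would be recomputed from the pedestrian's observed state at every frame.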

Updated: 2021-03-11