Simulating multi-exit evacuation using deep reinforcement learning,arXiv - CS - Artificial Intelligence

Simulating multi-exit evacuation using deep reinforcement learning
arXiv - CS - Artificial Intelligence, Pub Date: 2020-07-11, DOI: arxiv-2007.05783
Dong Xu, Xiao Huang, Joseph Mango, Xiang Li, Zhenlong Li

Conventional simulations of multi-exit indoor evacuation focus primarily on how to choose a reasonable exit based on numerous factors in a changing environment. Their results commonly show some exits congested and others under-utilized, especially with large numbers of pedestrians. We propose a multi-exit evacuation simulation based on Deep Reinforcement Learning (DRL), referred to as MultiExit-DRL, which uses a Deep Neural Network (DNN) framework to facilitate state-to-action mapping. The DNN framework applies Rainbow Deep Q-Network (DQN), a DRL algorithm that integrates several advanced DQN methods, to improve data utilization and algorithm stability, and divides the action space into eight isometric directions for possible pedestrian choices. We compare MultiExit-DRL with two conventional multi-exit evacuation simulation models in three separate scenarios: 1) varying pedestrian distribution ratios, 2) varying exit width ratios, and 3) varying open schedules for an exit. The results show that MultiExit-DRL learns efficiently and reduces the total number of evacuation frames in all designed experiments. In addition, the integration of DRL allows pedestrians to explore other potential exits and helps determine optimal directions, leading to highly efficient exit utilization.
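As a rough illustration of the eight-direction action space and the state-to-action mapping mentioned in the abstract, the sketch below picks a movement direction from a vector of Q-values. It is not the authors' implementation: the direction ordering, the `select_action` and `step` helpers, the `speed` parameter, and the use of plain epsilon-greedy selection (rather than Rainbow DQN's own exploration mechanism) are all assumptions made for the example, and the Q-values would in practice come from the trained network.

```python
import math
import random

# Eight movement directions spaced 45 degrees apart.
# Assumed ordering: index 0 points east, then counter-clockwise.
NUM_ACTIONS = 8
DIRECTIONS = [
    (math.cos(k * math.pi / 4), math.sin(k * math.pi / 4))
    for k in range(NUM_ACTIONS)
]


def select_action(q_values, epsilon=0.0, rng=random):
    """Pick an action index from per-direction Q-values.

    With probability `epsilon` a random direction is explored;
    otherwise the direction with the highest Q-value is chosen.
    (Hypothetical helper; a simplification for illustration.)
    """
    if len(q_values) != NUM_ACTIONS:
        raise ValueError("expected one Q-value per direction")
    if rng.random() < epsilon:
        return rng.randrange(NUM_ACTIONS)
    return max(range(NUM_ACTIONS), key=lambda a: q_values[a])


def step(position, action, speed=1.0):
    """Move a pedestrian one frame along the chosen direction.

    `speed` is an assumed distance per frame, not a value from the paper.
    """
    dx, dy = DIRECTIONS[action]
    return (position[0] + speed * dx, position[1] + speed * dy)


if __name__ == "__main__":
    # Placeholder Q-values standing in for the network output.
    q = [0.1, 0.3, 0.9, 0.2, 0.0, -0.1, 0.05, 0.4]
    action = select_action(q)
    print(action, step((0.0, 0.0), action))
```

In a full simulation, each pedestrian would repeat this choice every frame, with the Q-values recomputed from the current state.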

Updated: 2020-07-14