Autonomous obstacle avoidance of UAV based on deep reinforcement learning
Journal of Intelligent & Fuzzy Systems ( IF 1.7 ) Pub Date : 2021-08-28 , DOI: 10.3233/jifs-211192
Songyue Yang 1 , Guizhen Yu 1 , Zhijun Meng 2 , Zhangyu Wang 1 , Han Li 1

In intelligent unmanned systems, unmanned aerial vehicle (UAV) obstacle avoidance is a core capability and a prerequisite for autonomy. Traditional algorithms are ill-suited to obstacle avoidance in complex, changing environments, given the limited sensors carried by UAVs. In this article, we use an end-to-end deep reinforcement learning (DRL) algorithm to enable a UAV to avoid obstacles autonomously. To address slow convergence in DRL, a Multi-Branch (MB) network structure is proposed so that the algorithm achieves good performance early in training; to address non-optimal decision-making caused by Q-value overestimation, the Revise Q-value (RQ) algorithm is proposed so that the agent can choose the optimal obstacle-avoidance strategy. Based on the flight characteristics of a rotor UAV, we build a V-REP 3D physics simulation environment to test obstacle avoidance performance. Experiments show that the improved algorithm accelerates the agent's convergence and increases the average return per episode by 25%.
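The abstract does not give the details of the RQ algorithm, but the overestimation problem it targets is well known in value-based DRL: the max operator in the standard Q-learning target inflates value estimates under noise. The sketch below is an illustrative assumption, not the paper's method; it contrasts the standard max target with a Double-DQN-style decoupled target, one common way to revise Q-values against overestimation. All function names and the toy data are hypothetical.

```python
import numpy as np

def td_target_max(q_next, reward, gamma=0.99):
    """Standard DQN target: max over next-state Q-values.
    Under noisy estimates, the max operator is biased upward."""
    return reward + gamma * np.max(q_next)

def td_target_decoupled(q_next_online, q_next_target, reward, gamma=0.99):
    """Double-DQN-style target: the online network selects the action,
    a separate target network evaluates it, reducing overestimation."""
    a = int(np.argmax(q_next_online))
    return reward + gamma * q_next_target[a]

# Toy example: the true next-state values are all 1.0, so any target
# above gamma * 1.0 reflects overestimation from estimation noise.
rng = np.random.default_rng(0)
true_q = np.ones(5)
noisy_online = true_q + rng.normal(0.0, 0.5, size=5)
noisy_target = true_q + rng.normal(0.0, 0.5, size=5)

t_max = td_target_max(noisy_online, reward=0.0)
t_dec = td_target_decoupled(noisy_online, noisy_target, reward=0.0)
```

Because the decoupled target evaluates the selected action with an independently trained network, its errors do not systematically align with the argmax, which is why this family of corrections tends to produce lower, better-calibrated targets in expectation.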

Updated: 2021-09-03