Integrating human experience in deep reinforcement learning for multi-UAV collision detection and avoidance
Industrial Robot (IF 1.8), Pub Date: 2021-09-24, DOI: 10.1108/ir-06-2021-0116
Guanzheng Wang, Yinbo Xu, Zhihong Liu, Xin Xu, Xiangke Wang, Jiarun Yan

Purpose

This paper aims to realize fully distributed multi-UAV collision detection and avoidance based on deep reinforcement learning (DRL), to address the low sample efficiency of DRL and speed up training, and to improve the applicability and reliability of the DRL-based approach in multi-UAV control problems.

Design/methodology/approach

A fully distributed, DRL-based collision detection and avoidance approach for multi-UAV systems is proposed. Human experience is integrated into policy training through a human experience-based adviser, and a hybrid control method combines the learning-based policy with traditional model-based control. Extensive experiments, including simulations, real flights and comparative experiments, are conducted to evaluate the performance of the approach.
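
The adviser and the hybrid control idea described above can be illustrated with a minimal sketch. The abstract does not specify how the adviser is consulted or when control switches to the model-based controller, so everything below is an assumption made for illustration: the rule-based `adviser_action`, the linearly decaying `advice_probability`, the distance-based switch in `hybrid_action`, and all parameter names and default values are hypothetical and are not taken from the paper.

```python
import math
import random


def adviser_action(obstacle_bearing, obstacle_distance, safe_distance=2.0):
    """Hypothetical rule-based adviser: steer away from a close obstacle.

    Returns a yaw-rate command in [-1, 1], or None when it has no advice.
    """
    if obstacle_distance >= safe_distance:
        return None
    # Turn away from the obstacle, harder the closer it is.
    urgency = 1.0 - obstacle_distance / safe_distance
    return -math.copysign(urgency, obstacle_bearing)


def advice_probability(step, decay_steps=10_000):
    """Assumed schedule: chance of following the adviser decays linearly."""
    return max(0.0, 1.0 - step / decay_steps)


def training_action(policy_action, advice, step, rng=random):
    """During training, follow the adviser often at first, the policy later."""
    if advice is not None and rng.random() < advice_probability(step):
        return advice
    return policy_action


def hybrid_action(policy_action, model_based_action, obstacle_distance,
                  switch_distance=1.0):
    """Assumed switch: use the model-based controller when dangerously close."""
    if obstacle_distance < switch_distance:
        return model_based_action
    return policy_action
```

In this sketch the adviser only shapes which actions are explored during training, while the hybrid switch acts at execution time; the two mechanisms are independent, so either could be replaced without touching the other.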

Findings

A fully distributed multi-UAV collision detection and avoidance method based on DRL is realized. The reward curve shows that integrating human experience significantly accelerates training and yields a higher mean episode reward than the pure DRL method. The experimental results show that the DRL method with human experience integration performs significantly better than the pure DRL method for multi-UAV collision detection and avoidance. Moreover, the experiments validate that the hybrid control method makes flight safer.

Originality/value

The fully distributed architecture is suitable for large-scale unmanned aerial vehicle (UAV) swarms and real applications. The DRL method with human experience integration significantly accelerates training compared with the pure DRL method. The proposed hybrid control strategy compensates for the shortcomings of two-dimensional light detection and ranging (LiDAR) and for other difficulties encountered in practical applications.




Updated: 2021-09-24