Soft Actor-Critic for Navigation of Mobile Robots
Journal of Intelligent & Robotic Systems (IF 3.1), Pub Date: 2021-05-14, DOI: 10.1007/s10846-021-01367-5
Junior Costa de Jesus, Victor Augusto Kich, Alisson Henrique Kolling, Ricardo Bedin Grando, Marco Antonio de Souza Leite Cuadros, Daniel Fernando Tello Gamarra

This paper presents a study of two deep reinforcement learning techniques for mobile robot navigation: the Soft Actor-Critic (SAC) algorithm, which is compared against the Deep Deterministic Policy Gradient (DDPG) algorithm under the same conditions. To make the robot reach a target in its environment, both networks take as inputs 10 laser range findings, the previous linear and angular velocities, and the relative position and angle of the mobile robot to the target. As outputs, the networks produce the linear and angular velocity of the mobile robot. The reward function was designed to give the agent a positive reward only when it reaches the target and a negative reward when it collides with any object. The proposed architecture was applied successfully in two simulated environments, and a comparison of the two techniques based on the obtained results demonstrated that the SAC algorithm outperforms the DDPG algorithm for the navigation of mobile robots (code available at https://github.com/dranaju/project).
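
As a rough illustration of the interface described in the abstract, the sketch below shows how the observation vector and the sparse reward could be assembled in Python. It is not taken from the paper's code: the threshold values, reward magnitudes, and function names are assumptions, and "relative position and angle" is read here as two scalars (distance and heading angle), giving 14 inputs in total.

    import numpy as np

    # Illustrative constants -- the paper's exact values are not given in the abstract.
    GOAL_RADIUS = 0.3          # metres within which the target counts as reached (assumed)
    COLLISION_DIST = 0.15      # laser reading below this counts as a collision (assumed)
    GOAL_REWARD = 100.0        # positive reward on reaching the target (assumed magnitude)
    COLLISION_REWARD = -100.0  # negative reward on hitting an obstacle (assumed magnitude)

    def make_observation(laser_scan, prev_lin_vel, prev_ang_vel,
                         dist_to_goal, angle_to_goal):
        """Assemble the network input described in the abstract: 10 laser
        range findings, the previous linear and angular velocity, and the
        robot's relative distance and angle to the target."""
        assert len(laser_scan) == 10
        return np.concatenate([
            laser_scan,
            [prev_lin_vel, prev_ang_vel, dist_to_goal, angle_to_goal],
        ]).astype(np.float32)

    def sparse_reward(dist_to_goal, laser_scan):
        """Sparse reward scheme from the abstract: positive only at the
        target, negative only on collision, zero otherwise."""
        if dist_to_goal < GOAL_RADIUS:
            return GOAL_REWARD, True        # episode ends on success
        if min(laser_scan) < COLLISION_DIST:
            return COLLISION_REWARD, True   # episode ends on collision
        return 0.0, False

The policy network would then map this 14-dimensional observation to a 2-dimensional action, the commanded linear and angular velocity.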
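For readers unfamiliar with the two algorithms being compared, their key difference can be summarized by the critic targets of the standard formulations. The sketch below is generic, not the paper's implementation: the stand-in callables, gamma, and alpha are illustrative.

    def ddpg_target(reward, done, next_state, q_target, mu_target, gamma=0.99):
        """DDPG bootstraps through a deterministic target policy mu."""
        next_action = mu_target(next_state)
        return reward + gamma * (1.0 - done) * q_target(next_state, next_action)

    def sac_target(reward, done, next_state, q1_target, q2_target, policy,
                   gamma=0.99, alpha=0.2):
        """SAC samples from a stochastic policy and subtracts the scaled
        log-probability of the sampled action (entropy regularization),
        using the minimum of twin target critics."""
        next_action, log_prob = policy(next_state)
        min_q = min(q1_target(next_state, next_action),
                    q2_target(next_state, next_action))
        return reward + gamma * (1.0 - done) * (min_q - alpha * log_prob)

The entropy term generally encourages broader exploration, which is consistent with the abstract's finding that SAC outperformed DDPG on this navigation task.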



