Stochastic Genetic Algorithm-Assisted Fuzzy Q-Learning for Robotic Manipulators
Arabian Journal for Science and Engineering (IF 2.6), Pub Date: 2021-02-10, DOI: 10.1007/s13369-021-05379-z
Amit Kukker, Rajneesh Sharma

This work proposes a stochastic genetic algorithm-assisted Fuzzy Q-Learning controller for robotic manipulators. Specifically, the aim is to redefine the action selection mechanism in Fuzzy Q-Learning for robotic manipulator control. Conventionally, a Fuzzy Q-Learning-based controller selects an action deterministically from the available actions using fuzzy Q values. This deterministic Fuzzy Q-Learning is inefficient, especially for highly coupled nonlinear systems such as robotic manipulators: restricting the search for the optimal action to the agent's action set, or to a restricted set of Q values, is myopic. The proposal is to employ a genetic algorithm as a stochastic optimizer for action selection at each stage of the Fuzzy Q-Learning-based controller. This proves highly effective for robotic manipulator control compared with choosing the algebraically minimal action. As case studies, the proposed approach is implemented on two manipulators: (a) a two-link arm manipulator and (b) a selective compliance assembly robotic arm (SCARA). The scheme is compared with a baseline Fuzzy Q-Learning controller, a Lyapunov Markov game-based controller, and a Linguistic Lyapunov Reinforcement Learning controller. Simulation results show that the stochastic genetic algorithm-assisted Fuzzy Q-Learning controller outperforms these controllers in tracking error while requiring lower torque.
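
The contrast between deterministic and genetic-algorithm-based action selection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the encoding, operators, or parameters, so everything here is an assumption made for illustration. In particular, `q_cost` is a hypothetical function that returns a Q (cost) value for a continuous action, lower values are assumed to be better, and the names `deterministic_action` and `ga_action`, the truncation selection, arithmetic crossover, Gaussian mutation, and all default parameters are illustrative choices.

```python
import random


def deterministic_action(actions, q_cost):
    """Baseline: pick the candidate action with the smallest Q (cost) value."""
    return min(actions, key=q_cost)


def ga_action(q_cost, low, high, pop_size=20, generations=30,
              mutation_scale=0.1, seed_actions=(), rng=None):
    """Search [low, high] for a low-cost action with a simple real-coded GA.

    All operators and defaults are illustrative assumptions, not the paper's.
    """
    rng = rng or random.Random()
    # Initial population: optional seed actions plus uniform random samples.
    population = list(seed_actions)[:pop_size]
    population += [rng.uniform(low, high)
                   for _ in range(pop_size - len(population))]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=q_cost)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()
            child = w * p1 + (1 - w) * p2  # arithmetic crossover
            child += rng.gauss(0.0, mutation_scale * (high - low))  # mutation
            children.append(min(max(child, low), high))  # clip to bounds
        population = parents + children
    return min(population, key=q_cost)


if __name__ == "__main__":
    # Hypothetical cost surface whose minimum lies between the discrete actions.
    q_cost = lambda a: (a - 0.37) ** 2
    actions = [-1.0, -0.5, 0.0, 0.5, 1.0]
    a_det = deterministic_action(actions, q_cost)
    a_ga = ga_action(q_cost, -1.0, 1.0, seed_actions=actions,
                     rng=random.Random(0))
    print("deterministic:", a_det, "cost:", q_cost(a_det))
    print("GA:", a_ga, "cost:", q_cost(a_ga))
```

Because the sketch seeds the population with the discrete action set and keeps the best individuals in every generation, the action it returns is never worse under `q_cost` than the deterministic choice. The random search over the continuous interval is what allows it to find actions that lie between the discrete candidates.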




Updated: 2021-02-10