An adaptive adjustment strategy for bolt posture errors based on an improved reinforcement learning algorithm
Applied Intelligence (IF 5.3), Pub Date: 2020-11-13, DOI: 10.1007/s10489-020-01906-x
Wentao Luo, Jianfu Zhang, Pingfa Feng, Haochen Liu, Dingwen Yu, Zhijun Wu

Designing an intelligent and autonomous system remains a great challenge in the assembly field. Most reinforcement learning (RL) methods are applied to experiments with relatively small state spaces. However, the complicated conditions and high-dimensional spaces of the assembly environment cause traditional RL methods to perform poorly in terms of efficiency and accuracy. In this paper, a model-driven adaptive proximal policy optimization (MAPPO) method is presented that enables the assembly system to rectify bolt posture errors autonomously. In the MAPPO method, a probabilistic tree and an adaptive reward mechanism are used to improve the calculation efficiency and accuracy of the traditional PPO method. The size of the action space is reduced by using the probabilistic tree to establish a hierarchical logical relationship among the parameters. The adaptive reward mechanism mitigates the algorithm's tendency to fall into local minima. Finally, the proposed method was verified in the Unity simulation engine. The superiority and robustness of the proposed model were also validated by comparing different cases in simulations and experiments. The results show that MAPPO achieves better efficiency and accuracy than other state-of-the-art algorithms.
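
For context, the traditional PPO method that the abstract refers to optimizes a clipped surrogate objective, which limits how far one update can move the policy. The sketch below is not taken from the paper and does not reproduce MAPPO's probabilistic tree or adaptive reward; it only illustrates the baseline PPO clipping step. The function name `clipped_surrogate`, the clipping range `eps=0.2`, and the sample numbers are assumptions chosen for illustration.

```python
from typing import Sequence


def clipped_surrogate(
    ratios: Sequence[float],
    advantages: Sequence[float],
    eps: float = 0.2,  # assumed clipping range; a commonly used default
) -> float:
    """Mean PPO clipped surrogate objective over a batch of samples.

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = estimated advantage of action a_i in state s_i
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if len(ratios) == 0:
        raise ValueError("batch must not be empty")

    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - eps, 1 + eps].
        clipped_r = min(max(r, 1.0 - eps), 1.0 + eps)
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)


if __name__ == "__main__":
    # Hypothetical batch of three samples.
    print(clipped_surrogate([1.5, 0.5, 1.0], [1.0, -1.0, 2.0]))
```

In this hypothetical batch, the first sample's ratio of 1.5 is clipped to 1.2, so a large policy change earns no extra credit beyond the clipping range.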




Updated: 2020-11-13