Applied Soft Computing (IF 5.472), Pub Date: 2020-10-17, DOI: 10.1016/j.asoc.2020.106796
Abderraouf Maoudj; Abdelfetah Hentout
Optimizing a path within a short computation time remains a major challenge for mobile robotics applications. In path planning and obstacle avoidance, the Q-Learning (QL) algorithm has been widely used as a computational method of learning through interaction with the environment. However, less emphasis has been placed on path optimization using QL because of its slow and weak convergence towards optimal solutions. Therefore, this paper proposes an Efficient Q-Learning (EQL) algorithm to overcome these limitations and ensure an optimal collision-free path in the least possible time. In the QL algorithm, successful learning depends closely on the design of an effective reward function and of an efficient action-selection strategy that balances exploration and exploitation. In this regard, a new reward function is proposed to initialize the Q-table and provide the robot with prior knowledge of the environment, followed by a new efficient selection strategy that accelerates the learning process through search space reduction while ensuring rapid convergence towards an optimized solution. The main idea is to intensify the search, at each learning stage, around the straight-line segment linking the current position of the robot to the target (the optimal path in terms of length). During the learning process, the proposed strategy favors promising actions that not only lead to an optimized path but also accelerate the convergence of the learning process. The proposed EQL algorithm is first validated using benchmarks from the literature, followed by a comparison with other existing QL-based algorithms. The results show that the proposed EQL achieves good learning proficiency; moreover, the training performance is significantly improved compared to the state of the art. In conclusion, EQL improves the quality of the paths in terms of length, computation time and robot safety, and furthermore outperforms other optimization algorithms.
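The two ingredients named in the abstract (a Q-table initialized with prior knowledge, and a selection strategy that favors moves toward the target) can be illustrated with a minimal Python sketch. It is not the authors' EQL implementation: the grid world, the distance-based initialization, the rewards (-1 per step, -5 per collision), the exploration rule, the parameter values and all function names are assumptions made for this example.

```python
import math
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def dist(cell, goal):
    """Straight-line (Euclidean) distance between two grid cells."""
    return math.hypot(goal[0] - cell[0], goal[1] - cell[1])


def move(state, action):
    """Cell that an action aims at, ignoring walls and obstacles."""
    return (state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1])


def init_q(size, goal):
    """Assumed prior: seed Q(s, a) with minus the straight-line distance
    from the cell that action a aims at to the goal."""
    return {((r, c), a): -dist(move((r, c), a), goal)
            for r in range(size) for c in range(size)
            for a in range(len(ACTIONS))}


def step(state, action, size, obstacles):
    """Deterministic grid move; a blocked move leaves the robot in place."""
    nxt = move(state, action)
    inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    if not inside or nxt in obstacles:
        return state, -5.0  # assumed collision penalty
    return nxt, -1.0        # assumed unit cost per step


def greedy(q, state):
    """Action with the highest Q-value (first one wins on ties)."""
    return max(range(len(ACTIONS)), key=lambda a: q[state, a])


def choose(q, state, goal, eps, rng):
    """Assumed selection rule: exploit greedily, but when exploring,
    sample only among moves that bring the robot closer to the goal."""
    if rng.random() < eps:
        closer = [a for a in range(len(ACTIONS))
                  if dist(move(state, a), goal) < dist(state, goal)]
        return rng.choice(closer)
    return greedy(q, state)


def train(size, obstacles, start, goal, episodes=1000,
          alpha=1.0, gamma=1.0, eps=0.1, seed=0):
    """Tabular Q-learning. alpha=1 suits this deterministic grid, and
    gamma=1 makes each Q-value read as minus the remaining path cost."""
    rng = random.Random(seed)
    q = init_q(size, goal)
    for _ in range(episodes):
        state = start
        for _ in range(4 * size * size):  # step cap per episode
            action = choose(q, state, goal, eps, rng)
            nxt, reward = step(state, action, size, obstacles)
            future = 0.0 if nxt == goal else max(
                q[nxt, a] for a in range(len(ACTIONS)))
            target = reward + gamma * future
            q[state, action] += alpha * (target - q[state, action])
            state = nxt
            if state == goal:
                break
    return q


def greedy_path(q, size, obstacles, start, goal):
    """Follow the learned greedy policy from start, with a length cap."""
    path, state = [start], start
    while state != goal and len(path) <= size * size:
        state, _ = step(state, greedy(q, state), size, obstacles)
        path.append(state)
    return path
```

For instance, `greedy_path(train(6, walls, (0, 5), (5, 5)), 6, walls, (0, 5), (5, 5))` with `walls = {(2, 1), (2, 2), (2, 3), (2, 4), (2, 5)}` forces the robot to detour around a wall that blocks the straight line to the target. The distance-based seed never overestimates the remaining cost under these assumed rewards, so blocked moves are discovered and devalued during learning rather than known in advance.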