
Research On Adaptive Software Model Of Mobile Robot Based On Reinforcement Learning

Posted on: 2019-01-17
Degree: Master
Type: Thesis
Country: China
Candidate: W Y Yue
Full Text: PDF
GTID: 2348330542472650
Subject: Computer Science and Technology

Abstract/Summary:
The adaptive navigation and obstacle-avoidance capability of a mobile robot in an unknown environment determines its degree of intelligence, and improving this capability has long been a hot topic in robotics. At present, the main adaptive navigation and obstacle-avoidance algorithms are the artificial potential field method, fuzzy rule control, genetic algorithms, and so on. However, the adaptive ability of these algorithms is limited: they require considerable prior knowledge of the environment, which makes adaptive navigation and obstacle avoidance in a completely unknown environment difficult to achieve. Reinforcement learning is a machine learning approach in which an agent adjusts its policy through interaction with the environment and eventually finds the optimal strategy for achieving its goal. An adaptive navigation and obstacle-avoidance algorithm based on reinforcement learning can solve the problems above well. However, reinforcement learning suffers from long training times and slow convergence.

The research direction of this thesis is to use the Q-learning algorithm from reinforcement learning to realize navigation and obstacle avoidance for a mobile robot in an unknown environment, and to optimize Q-learning to improve its convergence speed. This thesis first defines the basic elements of the reinforcement learning problem, such as the operating environment, the action space, and the reward function of the mobile robot. Then two improved Q-learning optimization models are put forward: 1) a Q-learning model based on a task-specific additional reward function; 2) a hybrid Dyna model based on Q-learning. Finally, experiments with the two models are carried out on an HBE-SmartCAR mobile robot. The experimental results show that both models effectively improve the convergence efficiency of the algorithm.

The main contributions of this thesis are as follows:
(1) An ARIMA model is used to predict the sonar data of the mobile robot, which reduces data noise and improves the stability of the algorithm.
(2) An additional reward function based on the target task is proposed, which improves the convergence speed of the algorithm without relying on prior knowledge of the environment.
(3) A hybrid Dyna algorithm based on Q-learning is proposed. The hybrid model improves on the ordinary Dyna model in the following aspects:
a) a CMAC neural network is employed to simulate the environment and is used to update action values during training;
b) a priority queue is employed to reduce the randomness of planning updates in the ordinary Dyna model;
c) a value-function-based initialization method driven by the motion target is used to initialize the Q-table, which reduces the blindness of the robot's early exploration without requiring prior environment information;
d) a heuristic action-selection method based on the neural network is proposed to improve the efficiency of the robot's action selection;
e) the target-based additional reward function is used to improve the effectiveness of the reward value.
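To illustrate the first idea, the following is a minimal sketch of tabular Q-learning with a task-based additional reward on a toy grid world. The grid size, the reward values, and the distance-based bonus are assumptions for illustration only, not the thesis's actual environment or parameters; the thesis runs on an HBE-SmartCAR robot with sonar input.

```python
import random

random.seed(0)

GRID = 5                       # toy 5x5 grid (assumption)
GOAL = (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # right, left, down, up
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q-table over (state, action) pairs, initialized to zero
Q = {((x, y), a): 0.0 for x in range(GRID) for y in range(GRID)
     for a in range(len(ACTIONS))}

def step(state, a):
    dx, dy = ACTIONS[a]
    nx = min(max(state[0] + dx, 0), GRID - 1)
    ny = min(max(state[1] + dy, 0), GRID - 1)
    return (nx, ny)

def manhattan(s):
    return abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1])

def reward(state, next_state):
    base = 100.0 if next_state == GOAL else -1.0       # sparse base reward
    shaping = manhattan(state) - manhattan(next_state) # additional reward:
    return base + 0.5 * shaping                        # progress toward goal

def choose(state):
    # epsilon-greedy action selection
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[(state, a)])

for episode in range(300):
    s = (0, 0)
    while s != GOAL:
        a = choose(s)
        s2 = step(s, a)
        r = reward(s, s2)
        best_next = max(Q[(s2, b)] for b in range(len(ACTIONS)))
        # standard Q-learning update with the shaped reward
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# greedy rollout from the start after training
s, steps = (0, 0), 0
while s != GOAL and steps < 50:
    s = step(s, max(range(len(ACTIONS)), key=lambda a: Q[(s, a)]))
    steps += 1
print(s == GOAL)
```

The additional reward (the `shaping` term) supplies a dense learning signal on every step, which is what speeds up convergence relative to the sparse goal reward alone.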
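The priority-queue idea in the hybrid Dyna model can be sketched as prioritized sweeping over a learned model, again on an illustrative grid world. The thesis pairs planning with a CMAC network as the environment model; here a plain dictionary stands in for it, and all parameter values are assumptions.

```python
import heapq
import random

random.seed(0)

GRID, GOAL = 5, (4, 4)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
ALPHA, GAMMA, EPSILON, THETA, N_PLAN = 0.1, 0.9, 0.1, 1e-4, 10

Q = {}            # sparse Q-table
model = {}        # (s, a) -> (r, s2): learned world model (stand-in for CMAC)
predecessors = {} # s2 -> set of (s, a) observed to lead to it
pq = []           # priority queue of (-priority, (s, a))

def q(s, a):
    return Q.get((s, a), 0.0)

def step(s, a):
    dx, dy = ACTIONS[a]
    s2 = (min(max(s[0] + dx, 0), GRID - 1),
          min(max(s[1] + dy, 0), GRID - 1))
    return (100.0 if s2 == GOAL else -1.0), s2

def choose(s):
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q(s, a))

for episode in range(100):
    s = (0, 0)
    while s != GOAL:
        a = choose(s)
        r, s2 = step(s, a)
        model[(s, a)] = (r, s2)
        predecessors.setdefault(s2, set()).add((s, a))
        # queue real experience by the magnitude of its TD error
        p = abs(r + GAMMA * max(q(s2, b) for b in range(len(ACTIONS))) - q(s, a))
        if p > THETA:
            heapq.heappush(pq, (-p, (s, a)))
        # planning: replay the highest-priority updates from the model,
        # instead of the uniform random replay of ordinary Dyna
        for _ in range(N_PLAN):
            if not pq:
                break
            _, (ps, pa) = heapq.heappop(pq)
            pr, ps2 = model[(ps, pa)]
            target = pr + GAMMA * max(q(ps2, b) for b in range(len(ACTIONS)))
            Q[(ps, pa)] = q(ps, pa) + ALPHA * (target - q(ps, pa))
            # re-queue predecessors whose TD error grew as a result
            for (qs, qa) in predecessors.get(ps, set()):
                qr, _ = model[(qs, qa)]
                pp = abs(qr + GAMMA * max(q(ps, b) for b in range(len(ACTIONS)))
                         - q(qs, qa))
                if pp > THETA:
                    heapq.heappush(pq, (-pp, (qs, qa)))
        s = s2

# greedy rollout from the start after training
s, steps = (0, 0), 0
while s != GOAL and steps < 50:
    s = step(s, max(range(len(ACTIONS)), key=lambda a: q(s, a)))[1]
    steps += 1
print(s == GOAL)
```

Prioritizing updates by TD-error magnitude propagates value changes backward from the goal first, which is how the priority queue removes the wasted, randomly ordered planning updates of the ordinary Dyna model.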
Keywords/Search Tags: reinforcement learning, mobile robot, adaptive obstacle avoidance and navigation, Q-learning, Dyna, CMAC neural network