
Q-learning And Its Application To Local Path Planning Of The AUV

Posted on: 2005-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: L Xu
Full Text: PDF
GTID: 2168360125970726
Subject: Control theory and control engineering
Abstract/Summary:
Local path planning is a difficult problem in the navigation tasks of an Autonomous Underwater Vehicle (AUV), and adaptability is an important capability for the AUV. Reinforcement learning (RL) is considered an appropriate paradigm for acquiring control policies for autonomous robots that operate in initially unknown environments. The most prevalent RL method is Q-learning, owing to its simplicity and well-developed theory. This thesis studies reinforcement learning and its application to local path planning of the AUV. The main work is as follows.

The structural model of a reinforcement learning system is investigated and its composition is determined. Implementation methods for the input module, reinforcement module, and policy module are discussed, and the working principle of the system is analyzed with respect to a simple reinforcement learning system.

The basic principle and algorithm of Q-learning and related algorithms, including Q(λ) and SARSA(λ), are studied. Two types of RL algorithms are distinguished, on-policy and off-policy, and the convergence of the SARSA(0) algorithm is discussed. The theoretical foundations and implementation methods for applying reinforcement learning to an on-line decision-making system are presented.

To address the slow convergence of standard Q-learning, the multi-step on-policy SARSA(λ) algorithm is adopted. Compared with standard Q-learning, it converges faster, produces smoother running tracks, and makes the training process safer. CMAC is a neural network with local generalization; compared with a BP network, it converges faster and adapts better, making it suitable for dynamic real-time on-line control.

In this thesis, a multi-step on-policy reinforcement learning method with continuous actions is presented that works in continuous domains.
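The tabular SARSA(λ) update adopted above can be sketched as follows. This is a minimal illustrative version with replacing eligibility traces, not the thesis's exact implementation; the environment interface (reset/step names and return values) is an assumption for the sketch.

```python
import numpy as np

def sarsa_lambda(env, n_states, n_actions, episodes=200,
                 alpha=0.1, gamma=0.95, lam=0.9, eps=0.1, seed=0):
    """Tabular SARSA(lambda) with replacing eligibility traces.

    `env` is assumed to expose reset() -> state and
    step(action) -> (next_state, reward, done); both names are
    illustrative, not taken from the thesis.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))

    def pick(s):
        # epsilon-greedy action selection
        if rng.random() < eps:
            return int(rng.integers(n_actions))
        return int(Q[s].argmax())

    for _ in range(episodes):
        E = np.zeros_like(Q)                  # eligibility traces
        s = env.reset()
        a = pick(s)
        done = False
        while not done:
            s2, r, done = env.step(a)
            a2 = pick(s2)
            # on-policy TD error: bootstrap on the action actually chosen
            delta = r + (0.0 if done else gamma * Q[s2, a2]) - Q[s, a]
            E[s, a] = 1.0                     # replacing trace
            Q += alpha * delta * E            # update every traced pair
            E *= gamma * lam                  # decay all traces
            s, a = s2, a2
    return Q
```

Because the TD error is propagated backward along the trace, credit reaches earlier state-action pairs in a single step, which is the source of the faster convergence noted above.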
The method represents continuous input spaces and Q functions by means of a CMAC neural network, and it generates continuous-valued actions using the same data structures as the standard discrete-action Q-learning network. As a consequence, the approach is computationally efficient for the task of local path planning. Simulation results in a robot domain show the superiority of this method over the discrete-action version in terms of both asymptotic performance and speed of learning.

Finally, a method is presented and implemented that uses a path-planning network and a wall-following network to solve local path learning of a mobile robot in environments with complex obstacles. The simulation results demonstrate the efficiency of the algorithm.
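The CMAC representation mentioned above can be illustrated with a minimal tile-coding sketch for a one-dimensional continuous input. The layer count, tile count, and learning rate here are illustrative choices, not the thesis's configuration; the point is the local generalization that makes CMAC faster to train than a BP network.

```python
import numpy as np

class CMAC:
    """Minimal CMAC (tile-coding) function approximator, 1-D input.

    Several overlapping tilings ("layers") are offset against each
    other; a prediction is the sum of one active weight per layer, so
    an update only affects inputs that share tiles with x (local
    generalization).
    """
    def __init__(self, n_layers=8, n_tiles=10, lo=0.0, hi=1.0, alpha=0.2):
        self.n_layers = n_layers
        self.lo = lo
        self.width = (hi - lo) / (n_tiles - 1)
        self.w = np.zeros((n_layers, n_tiles + 1))
        self.alpha = alpha / n_layers   # spread the step over active tiles

    def _tiles(self, x):
        # each layer is shifted by a fraction of one tile width
        z = (x - self.lo) / self.width
        return [int(z + l / self.n_layers) for l in range(self.n_layers)]

    def predict(self, x):
        return sum(self.w[l, t] for l, t in enumerate(self._tiles(x)))

    def update(self, x, target):
        # LMS update shared across the active tiles
        err = target - self.predict(x)
        for l, t in enumerate(self._tiles(x)):
            self.w[l, t] += self.alpha * err
```

In a Q-learning setting the same structure would be used with the state (or state-action pair) as input and the TD target in place of `target`; because each update touches only the handful of active tiles, it is cheap enough for on-line control loops.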
Keywords/Search Tags: Reinforcement learning, Q-learning, SARSA(λ) algorithm, CMAC neural network, AUV, Local path planning