
Improvement Of Q-learning Reinforcement Learning Algorithm And Its Application In Unmanned Vehicle Path Planning

Posted on: 2019-11-26 | Degree: Master | Type: Thesis
Country: China | Candidate: X M Liu | Full Text: PDF
GTID: 2492306047451834 | Subject: Control Engineering
Abstract/Summary:
In recent years, unmanned vehicles have become an active branch of artificial intelligence research, and path planning is one of its essential components. This thesis combines unmanned vehicle path planning with reinforcement learning so that a vehicle placed in an unknown environment can learn that environment on its own and find the optimal path to its destination. Reinforcement learning removes the need to explicitly model complex or unknown tasks and obtains the optimal policy through self-learning. However, common reinforcement learning algorithms such as Q-learning cannot adapt to a dynamic environment: when the environment changes, the vehicle must relearn it, and even a slight change forces it to relearn as if facing an entirely new environment. Improving reinforcement learning algorithms so that they adapt to a changing environment is therefore of great practical significance.

First, we present a hierarchical treatment of the environment and apply this idea to epsilon-greedy exploration, yielding a layered exploration method: the hierarchical epsilon-greedy exploration algorithm. A complex environment is divided into several levels, from a detailed lower level up to higher levels of abstraction, so the agent can learn a detailed planning path in the lower-level environment while the higher levels abstract the environment. When the environment changes, the higher levels, which retain the abstraction of the environment, guide the agent in learning the changed parts, improving the learning efficiency of the unmanned vehicle.

Second, the hierarchical epsilon-greedy exploration algorithm is incorporated into Q-learning to obtain a new algorithm, the hierarchical epsilon-greedy Q-learning algorithm. Since the core of reinforcement learning lies in the agent's action-selection strategy, this algorithm uses hierarchical epsilon-greedy exploration to select actions at each environment level and guides the agent's choices from the perspective of the environment as a whole. It retains the learning ability of standard Q-learning while improving convergence speed and environmental adaptability through the hierarchical exploration scheme.

Finally, the proposed hierarchical epsilon-greedy Q-learning algorithm is applied to unmanned vehicle path planning. We compare it with the standard Q-learning algorithm in terms of learning speed and environmental adaptability. The simulation results show that the unmanned vehicle using the hierarchical epsilon-greedy Q-learning algorithm is greatly improved in both learning speed and environmental adaptability.
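The thesis text itself is not included on this page, so the following is only a minimal Python sketch of the idea summarized above: Q-learning on a grid world with a two-level value table (coarse regions and fine cells) and epsilon-greedy exploration applied at each level. The grid size, region granularity, reward values, parameter settings, and the choice to combine the two levels by summing their Q-values are illustrative assumptions, not the author's implementation.

    import random
    from collections import defaultdict

    # Assumed toy setting: a 12x12 grid, 4x4 coarse regions, one goal cell.
    GRID_W, GRID_H = 12, 12
    REGION_SIZE = 4
    GOAL = (11, 11)
    OBSTACLES = {(5, 5), (5, 6), (6, 5)}          # illustrative obstacles
    ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # right, left, up, down

    ALPHA, GAMMA = 0.1, 0.9       # learning rate, discount factor (assumed)
    EPS_HIGH, EPS_LOW = 0.2, 0.1  # exploration rates for coarse / fine level

    q_low = defaultdict(float)    # fine level:   (cell, action)   -> value
    q_high = defaultdict(float)   # coarse level: (region, action) -> value

    def region(cell):
        """Map a fine grid cell to its coarse (abstract) region."""
        x, y = cell
        return (x // REGION_SIZE, y // REGION_SIZE)

    def step(cell, action):
        """Move in the grid; walls and obstacles block movement."""
        nx, ny = cell[0] + action[0], cell[1] + action[1]
        if not (0 <= nx < GRID_W and 0 <= ny < GRID_H) or (nx, ny) in OBSTACLES:
            return cell, -1.0, False          # blocked: stay put, penalty
        if (nx, ny) == GOAL:
            return (nx, ny), 10.0, True       # reached the goal
        return (nx, ny), -0.1, False          # ordinary move, small step cost

    def choose_action(cell):
        """Hierarchical epsilon-greedy: each level explores with its own
        epsilon; otherwise act greedily on combined coarse + fine values."""
        if random.random() < EPS_HIGH or random.random() < EPS_LOW:
            return random.choice(ACTIONS)
        r = region(cell)
        return max(ACTIONS, key=lambda a: q_low[(cell, a)] + q_high[(r, a)])

    def update(cell, action, reward, next_cell):
        """Standard Q-learning update applied at both levels."""
        best_low = max(q_low[(next_cell, a)] for a in ACTIONS)
        q_low[(cell, action)] += ALPHA * (reward + GAMMA * best_low - q_low[(cell, action)])
        r, r_next = region(cell), region(next_cell)
        best_high = max(q_high[(r_next, a)] for a in ACTIONS)
        q_high[(r, action)] += ALPHA * (reward + GAMMA * best_high - q_high[(r, action)])

    for episode in range(300):                # simple training loop
        cell = (0, 0)
        for _ in range(2000):                 # cap episode length
            action = choose_action(cell)
            next_cell, reward, done = step(cell, action)
            update(cell, action, reward, next_cell)
            cell = next_cell
            if done:
                break

Because the coarse table is indexed by region rather than by cell, a local change to the obstacle layout mainly invalidates fine-level entries near the change while the abstract picture stays useful, which is in the spirit of the environmental adaptability described in the abstract.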
Keywords/Search Tags: reinforcement learning, unmanned vehicle path planning, hierarchical environment, learning speed, environmental adaptability