Research And System Implementation Of Path Planning Based On Deep Reinforcement Learning

Posted on: 2022-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z Mao
GTID: 2518306506463374
Subject: Computer technology
Abstract/Summary:
Path planning is a classic problem with applications in many fields. In recent years, solving path optimization problems with deep reinforcement learning has attracted considerable scholarly attention and has become a hot topic in path planning. Deep reinforcement learning combines strong perception ability with the strong decision-making ability of reinforcement learning, so it can both perceive environment scenarios well and make efficient decisions, and it is robust and general when applied to path planning problems. When the Deep Q-Network (DQN) algorithm is used to solve discrete path planning problems, network training is slow and takes a long time; and when the robot can obtain only local environment information, the success rate of the DQN algorithm is low. When the deep deterministic policy gradient (DDPG) algorithm is used to solve continuous path optimization problems, both the training time and the search time of the network are long. In view of these problems, the main work of this thesis is as follows:

(1) To address the slow convergence and large number of training rounds in discrete path planning, a probability-based DQN algorithm is proposed. The calculation of the Q value is modified so that the more often a state appears, the lower its probability becomes, which encourages exploration of new states and improves the efficiency of network training. Experiments verify that the algorithm improves the convergence rate of the network. To address the low success rate of the DQN algorithm in discrete path planning when only local environment information is available, a DQN algorithm based on a long short-term memory (LSTM) network is proposed. Adding an LSTM layer to the network structure lets the neural network process sequences of states and actions. Comparison experiments show that the algorithm greatly improves the success rate of the search.

(2) In continuous path planning, a preset reward function leads to long training and search times. To address this, a DDPG algorithm based on reward shaping is proposed. The reward function is modeled by a convolutional neural network and optimized dynamically. Comparison experiments show that the algorithm reduces both the training time and the search time.

(3) An intelligent automatic routing system in the Unity environment is designed and implemented using the Unity game engine, the C# programming language, the Python programming language, the TensorFlow deep learning framework, and the ML-Agents (Machine Learning Agents) plug-in. The system comprises four functional modules: a client layer, an agent layer, an interface layer, and an algorithm layer. It supports training and finding the end point in a customized Unity maze scene.
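
The probability-based exploration idea in item (1) can be illustrated with a small sketch. The abstract does not give the modified Q-value formula, so everything below is an assumption made for illustration only: the logarithmic visit-count penalty, the `beta` weight, the softmax selection rule, and the names `action_probabilities` and `visit_counts` are hypothetical and are not taken from the thesis.

```python
import math
from collections import defaultdict


def action_probabilities(q_values, next_states, visit_counts, beta=1.0):
    """Return softmax action probabilities after a visit-count penalty.

    Assumption (not from the thesis): each action's Q-value is reduced by
    beta * log(1 + visits of the state that action leads to), so actions
    leading to frequently seen states become less likely to be chosen.
    """
    scores = [
        q - beta * math.log1p(visit_counts[s])
        for q, s in zip(q_values, next_states)
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical grid-world step: action 0 leads to a cell visited 20 times,
# action 1 leads to a cell that has never been visited.
visit_counts = defaultdict(int)
visit_counts[(0, 1)] = 20
q_values = [1.0, 0.9]
next_states = [(0, 1), (1, 0)]

plain = action_probabilities(q_values, next_states, visit_counts, beta=0.0)
adjusted = action_probabilities(q_values, next_states, visit_counts, beta=1.0)
print("without penalty:", plain)
print("with penalty:   ", adjusted)
```

With `beta=0.0` the penalty is disabled and the action with the higher Q-value is preferred; with `beta=1.0` the preference shifts toward the unvisited cell. In a full DQN the Q-values would come from the network rather than a hand-written list.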
Keywords/Search Tags: DQN algorithm, DDPG algorithm, Path planning, LSTM network, Reward shaping