Research On Path Planning Technology Based On Deep Reinforcement Learning

Posted on: 2020-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Liu
GTID: 2518306131966059
Subject: Electronics and Communications Engineering
Abstract/Summary:
In recent years, unmanned aerial vehicles (UAVs) have been widely used in agriculture, commerce, the military, and other fields because of their low manufacturing cost, flexibility, and portability. As their range of applications grows, the environments they operate in are becoming increasingly complicated. Achieving autonomous flight in complex, unknown environments, where accurate geographic information and GPS signals are difficult to obtain, is a technical difficulty the field must overcome in order to further expand the application of UAVs. Meanwhile, breakthroughs in artificial intelligence, represented by deep learning in speech and image processing, have triggered the third wave of artificial intelligence research. Artificial intelligence technology is being applied in many fields and deeply affects every aspect of life and production in human society. Deep reinforcement learning combines the feature extraction ability of deep learning with the planning and decision-making ability of reinforcement learning. It is widely used for sequential decision problems in high-dimensional continuous spaces and has great development potential in path planning.

Based on the deep deterministic policy gradient (DDPG) algorithm framework, this thesis proposes an improved deep reinforcement learning algorithm, Deep Deterministic Policy Gradient Filter K-Means Priority (DDPG-FKP). To address the slow convergence of traditional deep reinforcement learning algorithms, the algorithm introduces a radar-assisted experience pool optimization scheme. First, a storage threshold for the experience pool is set according to the distance between the UAV and the obstacle, which screens the sample data. Next, the K-Means algorithm clusters the data in the experience pool. Finally, data sampling priority is assigned according to the temporal-difference error (TD-error). Together, these steps improve the quality of the training samples in the experience pool and accelerate the convergence of the algorithm.

Traditional deep reinforcement learning frameworks use a single reward function, which yields only mediocre decision-making performance in complex tasks. To address this, a segmented reward function scheme is proposed. Mimicking the way humans learn to drive, it designs different reward functions for different learning stages, improving the accuracy of the algorithm's decisions stage by stage. Simulation results show that the proposed scheme improves convergence efficiency by 35.4% over the comparison algorithm and increases the test flight success rate by 17%. It can effectively improve the autonomous flight capability of UAVs in complex environments.
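The experience pool screening, TD-error priority sampling, and stage-dependent reward described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract gives no formulas, so the class and function names, the screening direction (keep only samples recorded within the threshold distance of an obstacle), the proportional priority rule, and all numeric reward values are assumptions made for illustration. The K-Means clustering step is omitted because the abstract does not say which features are clustered.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Transition:
    state: tuple
    action: tuple
    reward: float
    next_state: tuple
    obstacle_distance: float  # radar range to the nearest obstacle (assumed field)
    td_error: float = 1.0     # would be refreshed by the learner after each update


@dataclass
class ScreenedReplayPool:
    """Hypothetical experience pool with distance screening and TD-error priority."""
    distance_threshold: float  # assumed storage threshold
    eps: float = 1e-3          # keeps every sampling weight strictly positive
    items: list = field(default_factory=list)

    def add(self, t: Transition) -> bool:
        # Assumed screening rule: store only samples recorded near an obstacle.
        if t.obstacle_distance > self.distance_threshold:
            return False
        self.items.append(t)
        return True

    def sample(self, k: int, rng: random.Random) -> list:
        # Draw k transitions (with replacement), weighted by |TD-error| + eps.
        weights = [abs(t.td_error) + self.eps for t in self.items]
        return rng.choices(self.items, weights=weights, k=k)


def staged_reward(stage: int, dist_to_goal: float, collided: bool) -> float:
    """Hypothetical segmented reward: the reward function changes with the learning stage."""
    if collided:
        return -10.0                   # collision penalty in every stage
    if stage == 0:
        return 0.1                     # early stage: reward survival only
    return 0.1 - 0.01 * dist_to_goal   # later stages: also reward progress to the goal


if __name__ == "__main__":
    pool = ScreenedReplayPool(distance_threshold=5.0)
    pool.add(Transition((0.0,), (0.0,), 0.0, (1.0,), obstacle_distance=2.0, td_error=0.2))
    pool.add(Transition((1.0,), (0.0,), 0.0, (2.0,), obstacle_distance=9.0))  # screened out
    pool.add(Transition((2.0,), (0.0,), -1.0, (3.0,), obstacle_distance=1.0, td_error=3.0))
    batch = pool.sample(4, random.Random(0))
    print(len(pool.items), len(batch))
```

In a full DDPG training loop, the learner would overwrite `td_error` on each sampled transition after every critic update, and the `stage` argument would advance as training progresses. Both are left to the caller here.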
Keywords/Search Tags: Deep Reinforcement Learning, UAV, Path Planning