Research On Autonomous Flight Method Of Drone With Deep Reinforcement Learning

Posted on: 2020-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: R T Zhu
Full Text: PDF
GTID: 2392330620960689
Subject: Aeronautical and Astronautical Science and Technology
Abstract/Summary:
Autonomous flight technology plays an important role in UAV (Unmanned Aerial Vehicle) applications such as autonomous power line inspection and UAV mapping. Traditional autonomous flight methods rely on an accurate mathematical model, which in turn depends on sufficient sensor information. However, it is often difficult to establish an effective mathematical model of autonomous control in relatively complex situations. For these cases, autonomous drone flight methods based on machine learning algorithms have become a research hotspot in recent years. Reinforcement learning (RL) is one of the most widely used machine learning methods in the field of robotic autonomous control. Deep reinforcement learning combines the advantages of deep learning and reinforcement learning; it has developed rapidly in recent years and is bringing new ideas to research on autonomous flight of drones.

Based on the kinematics of quadrotors, this paper models the autonomous flight process of the quadrotor. The mathematical model simplifies the autonomous flight process and casts it as a Markov decision process, so that it can be optimized by reinforcement learning. To simplify the solution process, this paper also simplifies the quadrotor control model, reducing the control channels from four continuous degrees of freedom to one continuous degree of freedom. To verify the effect of the deep reinforcement learning method, this paper designs a specific task scenario: autonomous quadrotor flight in a complex urban scene while avoiding obstacles.

Based on the Actor-Critic architecture, this paper proposes a framework for autonomous quadrotor flight using monocular image data. The framework includes a policy network and an evaluation network. The policy network generates control commands, and the evaluation network evaluates the control result of the drone. To speed up learning, the evaluation network is trained online using a temporal-difference sampling algorithm. The policy network is optimized with a policy gradient derived from the output of the evaluation network. The two networks are updated alternately until both converge. To prevent the autonomous control model from converging to a local optimum, this paper uses the ε-greedy algorithm to process the output of the policy network during training: with a certain probability the drone executes the optimal control command derived from the policy network, and otherwise it executes a random command. This achieves a balance between exploration and exploitation. Experimental results show that the deep reinforcement learning framework is effective for autonomous flight of drones.

To address the shortcomings of the algorithm, this paper proposes three improvements. First, to reduce the high latency of existing deep reinforcement learning methods on the UAV platform, a neural network acceleration method is proposed. Second, because the data utilization efficiency of existing reinforcement learning algorithms is very low, an eligibility trace model is used to improve it. Finally, a multimodal policy network is proposed to implement closed-loop control. The experimental results show that the improved algorithm performs substantially better than the original algorithm.

For autonomous flight of quadrotors in complex scenes, this paper proposes a quadrotor autonomous flight algorithm based on deep reinforcement learning. It completes the mathematical modeling of the autonomous flight process of the drone and establishes a mathematical description suitable for the deep reinforcement learning framework. Three improvements are proposed to address the shortcomings of the algorithm. The experimental results show that the proposed algorithm outperforms traditional algorithms in real-time performance and robustness.
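
The ε-greedy step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `epsilon_greedy_command`, the command range `[-1.0, 1.0]`, and the uniform distribution for the random command are assumptions made here for illustration, with a single scalar command standing in for the one-degree-of-freedom control channel.

```python
import random


def epsilon_greedy_command(policy_command, epsilon, low=-1.0, high=1.0, rng=random):
    """Pick the command to execute for one training step.

    With probability 1 - epsilon the policy network's command is kept
    (exploitation); with probability epsilon a random command is drawn
    instead (exploration). The range [low, high] and the uniform draw
    are illustrative assumptions, not values taken from the thesis.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in [0, 1]")
    if rng.random() < epsilon:
        # Explore: ignore the policy output and sample a random command.
        return rng.uniform(low, high)
    # Exploit: execute the command proposed by the policy network.
    return policy_command


if __name__ == "__main__":
    rng = random.Random(0)  # seeded generator so runs are repeatable
    # 0.3 stands in for a hypothetical policy-network output.
    commands = [epsilon_greedy_command(0.3, 0.1, rng=rng) for _ in range(10)]
    print(commands)
```

A larger `epsilon` makes the drone explore more during training; setting it to 0 gives the pure policy output, which is what one would typically use at evaluation time.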
Keywords/Search Tags: Markov decision process, multimodal, deep reinforcement learning, autonomous flight, quadrotor