
Deep Reinforcement Learning Method For Robot Control

Posted on: 2022-04-10    Degree: Master    Type: Thesis
Country: China    Candidate: H Hu    Full Text: PDF
GTID: 2518306740999079    Subject: Control Engineering
Abstract/Summary:
After long-term development, traditional robot control methods can achieve relatively stable and fast control in structured environments. In unstructured and unknown dynamic environments, however, these methods depend heavily on an accurate environment model, which directly degrades both the speed and the accuracy of control. Solving such problems requires extending robot control to a wider range of task environments. Combined with the rapid progress of machine learning in recent years, data-driven learning methods for robot control have gradually become an active research field. Deep reinforcement learning does not require an accurate model of the controlled object in advance; the controlled robot simply interacts with the environment, and the control policy is continuously optimized and updated from the interaction data. This offers new solutions for dynamic modeling, environment perception, and multi-robot collaboration in robot control tasks. Despite these advantages, deep reinforcement learning still faces many problems in real robot control applications, mainly in the design of control task decomposition, the design of the neural network architecture, and maintaining the stability and safety of the controlled object during policy iteration. Building on existing deep reinforcement learning methods and their application to robot control in real environments, the research work and contributions of this thesis are as follows:

A multi-scenario, multi-stage training mechanism is proposed, and a parallel proximal policy optimization (PPO) algorithm based on centralized training with distributed execution realizes obstacle avoidance and navigation for a robot swarm in complex environments. The policy is first learned in simple scenarios in the first stage and then further optimized in more complex scenarios in the second stage. Multi-stage training accelerates the convergence of the obstacle avoidance policy and achieves higher cumulative reward, so that the robot swarm can avoid obstacles and navigate in unknown environments. This provides a centralized-training, distributed-execution solution for multi-robot obstacle avoidance and navigation in large-scale scenarios.

A proximal policy optimization algorithm with integral compensation is proposed to solve the motion control problem of a quadrotor UAV when no accurate model is available. To handle the quadrotor's underactuated, unstable, and nonlinear dynamics, a deep neural network is used, on the basis of PPO, to map the UAV's state observations to control actions, and is trained by reinforcement learning from the reward signal. Because the controller trained with the original PPO algorithm exhibits steady-state error, a PPO variant with integral compensation is designed that successfully eliminates this error. In addition, a two-stage training mechanism is proposed that combines offline pre-training with online real-time optimization. The result is a robust motion controller with good dynamic performance and resistance to various disturbances.

A hybrid-policy method for cooperative UAV control is proposed. According to the mission requirements, the autonomous tracking-and-landing task of the UAV is decomposed into an autonomous tracking module and a landing module. The autonomous tracking controller is designed with PPO: a deep neural network maps the observations of the onboard radar to UAV motion commands. The landing module, in turn, is designed with heuristic rules: while the UAV issues tracking commands, it continuously adjusts its altitude according to the rules to complete the tracking-and-landing task. The proposed method combines deep reinforcement learning with rules, giving strong autonomous landing ability and adaptability in unpredictable harsh environments. Moreover, other deep reinforcement learning algorithms can be substituted for network training, which improves the generality of the control framework.
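The two-stage curriculum underlying the first contribution can be sketched in a few lines: the same policy parameters are carried from simple scenarios into harder ones instead of retraining from scratch. The `update_policy` callable below is a hypothetical stand-in for one round of parallel PPO training in a scenario; it is not the thesis's implementation.

```python
def multi_stage_train(update_policy, stages):
    """Curriculum training across scenarios of increasing difficulty.

    update_policy(params, scenario) -> params is a placeholder for one
    round of (parallel PPO) training in that scenario.  The stage list,
    e.g. ["simple", "complex"], mirrors the two stages described above.
    """
    params = None  # the first stage starts from scratch
    for scenario in stages:
        # the policy learned in earlier, simpler scenarios becomes the
        # starting point for the more complex ones
        params = update_policy(params, scenario)
    return params
```

The point of the mechanism is that the second stage optimizes a policy that already avoids obstacles in easy scenes, which is what accelerates convergence.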
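All three contributions build on the standard PPO clipped surrogate objective, which can be written as a short function. The clip range of 0.2 is a common default assumed here, not a value taken from the thesis.

```python
def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Clipped surrogate objective of PPO (to be maximized).

    ratios     : pi_new(a|s) / pi_old(a|s) for each sampled action
    advantages : estimated advantages A(s, a) for each sample
    eps        : clip range (0.2 is a common default, assumed here)
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # clip the probability ratio into [1 - eps, 1 + eps]
        clipped_r = min(max(r, 1.0 - eps), 1.0 + eps)
        # PPO takes the pessimistic (smaller) of the two terms per sample
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)
```

Clipping keeps each policy update close to the data-collecting policy, which is what makes PPO stable enough for the robot control tasks above.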
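The integral-compensation idea of the second contribution can be illustrated as a wrapper around a trained policy: an accumulated error term is added to the network's action to cancel residual offset. The gain `k_i`, time step, and the exact way the integral enters training are illustrative assumptions, not the thesis's formulation.

```python
class IntegralCompensatedPolicy:
    """A trained policy augmented with an integral term on the tracking
    error, sketching the steady-state-error elimination described above.
    The gain k_i and error signal are illustrative assumptions."""

    def __init__(self, policy, k_i=0.05, dt=0.02):
        self.policy = policy          # callable: observation -> action
        self.k_i = k_i                # integral gain (assumed value)
        self.dt = dt                  # control period in seconds
        self.error_integral = 0.0

    def act(self, observation, tracking_error):
        # accumulate the tracking error over time
        self.error_integral += tracking_error * self.dt
        # the integral term nudges the action to cancel residual offset
        return self.policy(observation) + self.k_i * self.error_integral
```

As in a PI controller, a persistent nonzero error grows the integral term until the commanded action removes the offset.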
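The hybrid scheme of the third contribution can be sketched as one control step: a learned policy produces the horizontal tracking command, while a heuristic rule decides the vertical command. The specific rule below, descend only while the target stays within a lock radius, is a simplified stand-in for the thesis's heuristic rules, and the parameter names are hypothetical.

```python
def landing_step(tracking_policy, obs, horizontal_dist, altitude,
                 lock_radius=0.5, descend_rate=0.1):
    """One control step of a hybrid tracking/landing scheme.

    tracking_policy maps sensor observations to horizontal velocity
    commands (learned with PPO in the thesis); the descent rule is a
    simplified illustrative stand-in for the heuristic landing rules.
    """
    vx, vy = tracking_policy(obs)       # learned tracking command
    if horizontal_dist < lock_radius:   # rule: target centered enough
        vz = -descend_rate              # descend toward the platform
    else:
        vz = 0.0                        # hold altitude, keep tracking
    return vx, vy, vz
```

Because only the tracking module is learned, the PPO network could be swapped for another deep reinforcement learning algorithm without touching the landing rules, which is the generality claim made above.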
Keywords/Search Tags: deep reinforcement learning, robot control, obstacle avoidance navigation, tracking landing