
Research And Implementation Of Agent Continuous Control Technology Based On Distributed Reinforcement Learning

Posted on: 2022-04-28  Degree: Master  Type: Thesis
Country: China  Candidate: T Q Xu  Full Text: PDF
GTID: 2518306566492214  Subject: Computer application technology

Abstract/Summary:
Deep reinforcement learning combines the decision-making ability of reinforcement learning with the perception of deep learning, realizing an end-to-end learning mode from input to output, and has a natural advantage in solving complex unmanned-equipment control problems. More and more unmanned-equipment control solutions are shifting from traditional control methods to deep reinforcement learning, but applying deep reinforcement learning to continuous control tasks faces the "curse of dimensionality". Continuous control tasks for intelligent unmanned equipment such as manipulators, simulated robots, and UAVs involve complex motion control problems; at the same time, factors such as joint friction, the rotational friction of joint twist, and voltage changes in UAV rotors lead to long reinforcement learning training times and make it difficult to converge to a robust model. As demand has developed, distributed reinforcement learning methods have attracted increasing attention from researchers, and parallel methods and high-performance computing frameworks are increasingly used to address the long training times of deep reinforcement learning.

Therefore, this thesis studies continuous control tasks and proposes a Learner-Actor asynchronous method based on importance sampling and a population-based evolutionary policy search method. On this basis, an extensible prototype system of a multi-learner, multi-actor reinforcement learning training framework is designed and implemented, and the results of this thesis are verified experimentally in simulation environments. The contributions of this thesis include the following three points:

(1) To address the long training time of continuous control tasks, a distributed Learner-Actor method based on importance sampling is proposed. Building on importance sampling and the V-trace method, states are processed and the action distribution is modeled through state feature encoding. Randomly sampling from this distribution enlarges the agent's exploration space, and replaying and transmitting the environment state as input stabilizes the action output, realizing an asynchronous sampled training method in continuous action spaces.

(2) To address the difficulty of policy search, a population-based evolutionary policy search method is proposed. Its core idea is to train multiple agents in parallel, periodically select the optimal agent, and combine it with the other agents to generate new individuals that form the next generation of the population; through these selection-and-evolution measures, training is driven toward the optimum. The algorithm improves the performance of off-policy algorithms through population-guided evolutionary strategy search, making it adaptable and extensible in complex continuous control tasks.

(3) Based on the above results, a prototype system of a multi-learner, multi-actor reinforcement learning training framework is designed and implemented, and the performance of the algorithms is tested through this prototype system. Experimental verification is carried out in a relatively complex quadrotor simulation environment and in a sparse-reward robot simulation environment closer to the real world. The results show that, compared with traditional reinforcement learning algorithms, the proposed algorithms achieve clear improvements in performance and robustness, with good scalability.
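The abstract does not give implementation details for its importance-sampling method, but the V-trace correction it builds on has a standard form (from the IMPALA line of work): off-policy trajectories collected by a behaviour policy are re-weighted with truncated importance ratios to produce value targets for the learner. A minimal sketch of that target computation, assuming NumPy arrays of per-step log-probabilities under the behaviour and target policies (all names and the array layout are illustrative):

```python
import numpy as np

def vtrace_targets(behaviour_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets v_s for a length-T trajectory
    collected by behaviour policy mu, evaluated under target policy pi.
    All array arguments are 1-D of length T; bootstrap_value is V(x_T)."""
    rhos = np.exp(target_logp - behaviour_logp)          # pi/mu ratios
    clipped_rhos = np.minimum(rho_bar, rhos)             # truncated IS weights
    cs = np.minimum(c_bar, rhos)                         # trace-cutting coefficients
    values_tp1 = np.append(values[1:], bootstrap_value)  # V(x_{t+1})
    deltas = clipped_rhos * (rewards + gamma * values_tp1 - values)
    vs_minus_v = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(rewards))):              # backward recursion
        acc = deltas[t] + gamma * cs[t] * acc
        vs_minus_v[t] = acc
    return values + vs_minus_v                           # targets v_s
```

When the behaviour and target policies coincide, the ratios are 1 and the targets reduce to ordinary n-step returns; the clipping constants `rho_bar` and `c_bar` bound the variance introduced by off-policy data from asynchronous actors.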
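The population-based selection-and-evolution step described in contribution (2) can be sketched in a few lines. The following is an illustrative generation update, not the thesis's actual algorithm: parameter vectors are ranked by episode return, the best agents are kept as elites, and the rest of the population is refilled with Gaussian-mutated copies of elites (the function name, `elite_frac`, and `sigma` are assumptions for the sketch):

```python
import random

def evolve_population(population, fitness, elite_frac=0.25, sigma=0.02):
    """One generation of population-based policy search: keep the
    highest-fitness parameter vectors, refill the population by
    mutating elites with Gaussian noise."""
    ranked = sorted(zip(fitness, population),
                    key=lambda p: p[0], reverse=True)
    n_elite = max(1, int(len(population) * elite_frac))
    elites = [params for _, params in ranked[:n_elite]]
    next_gen = list(elites)                       # elitism: best agents survive
    while len(next_gen) < len(population):
        parent = random.choice(elites)
        child = [w + random.gauss(0.0, sigma) for w in parent]
        next_gen.append(child)                    # mutated offspring
    return next_gen
```

In a full system each parameter vector would parameterize an off-policy agent trained in parallel, with this selection step applied periodically to guide the population toward better policies.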
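The multi-learner, multi-actor framework of contribution (3) follows a common producer-consumer pattern: actors roll out trajectories with local policy copies and push them to a shared queue, while learners pull trajectories and perform updates. A minimal single-process sketch using threads (the rollout and the update are dummies; the real system would be distributed and would ship gradients or parameters as well):

```python
import queue
import threading

def run_actor(actor_id, traj_queue, n_episodes):
    """Actor: roll out episodes and ship trajectories to the learners."""
    for ep in range(n_episodes):
        trajectory = [(actor_id, ep, step) for step in range(3)]  # dummy rollout
        traj_queue.put(trajectory)

def run_learner(traj_queue, updates, n_traj):
    """Learner: consume trajectories and perform (stand-in) update steps."""
    for _ in range(n_traj):
        traj = traj_queue.get()       # blocks until an actor produces data
        updates.append(len(traj))     # placeholder for a gradient step

traj_queue = queue.Queue()
updates = []
actors = [threading.Thread(target=run_actor, args=(i, traj_queue, 2))
          for i in range(2)]
learner = threading.Thread(target=run_learner, args=(traj_queue, updates, 4))
for t in actors:
    t.start()
learner.start()
for t in actors:
    t.join()
learner.join()
```

Because actors only touch the queue, the same structure scales by adding actor processes or machines, which is what makes the asynchronous Learner-Actor training of contribution (1) extensible.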
Keywords/Search Tags:Continuous Control Task, Distributed Reinforcement Learning, Importance Sampling, Parallel Training, Policy Search