A Research Of Deep Reinforcement Learning Algorithms In Combination With Multi-relations

Posted on:2021-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Geng

GTID:2518306476952969

Subject:Computer Science and Technology

Abstract/Summary:

Deep Reinforcement Learning (DRL) combines the perceptual ability of deep learning with the decision-making ability of reinforcement learning to address complex decision-making problems that are difficult to solve with traditional reinforcement learning. However, deep reinforcement learning models still have limitations such as low sample efficiency and poor generalization to small changes in the environment. These limitations are mainly reflected in the tendency of deep reinforcement learning to overfit the training data, so it fails to solve problems with complex logical and relational structures. To deal with these problems, DeepMind proposed a novel Relational Reinforcement Learning (RRL) algorithm in 2018. The algorithm advocates learning and reusing entity- and relation-centered functions, which implies relational reasoning, and it greatly improves performance, sample efficiency, and inductive capability. Relational reasoning is essential to the success of RRL models, whose computations focus explicitly on binary relations between pairs of entities. In the real world, however, multiple relations are more indicative of the way entities are connected. Although considering more entities in relational reasoning would benefit reinforcement learning, it comes at the cost of exponential growth in computational complexity. Computing multiple relations while maintaining computational efficiency is therefore a major challenge. In addition, deep reinforcement learning in a multi-agent environment faces harder challenges than in a single-agent environment, including non-stationarity and multi-agent credit assignment, and there is as yet no relevant research on RRL in the multi-agent field. In response to these challenges, this thesis proposes multi-relational DRL algorithms for both the single-agent and the multi-agent setting.

In the single-agent environment, the current RRL model gives agents the ability to reason through a multi-head dot-product attention (MHDPA) mechanism. However, this mechanism only considers binary relations and is difficult to extend to multiple relations. This thesis proposes a new relational module called the Multiple Relational Core (MRC) and combines it with the classical A3C algorithm to form a complete single-agent multiple relations actor-critic (MRAC) algorithm. MRC has two main characteristics: multiple relations are approximated efficiently, and the computational complexity of the module does not grow exponentially as the depth increases. To further reduce the computational complexity of MRC, we optimize the internal structure of the model and propose a simplified single-head version of the multiple relational core (MRCSH). MRCSH reduces the computational overhead while preserving the capacity for relational reasoning.

In the multi-agent environment, this thesis proposes a novel multi-agent multi-relational actor-critic (MMRAC) algorithm based on the single-agent Soft Actor-Critic (SAC) algorithm. A multi-relational mechanism is shared among the critics of all agents. The mechanism dynamically selects the agents that need attention by calculating the relations between agents, which enhances communication and cooperation and thus improves the performance of the algorithm. In addition, MMRAC further enhances the stability of the model by introducing a new multi-agent advantage function, which helps to solve the multi-agent credit assignment problem. MMRAC is flexible enough to be used in most multi-agent learning tasks.

Finally, the effectiveness of the MRAC and MMRAC algorithms is verified in different game environments. In the single-agent environment, experimental results show that MRAC not only learns better strategies but also improves the sample efficiency and generalization ability of the model. Compared with the MRC module, MRCSH maintains the performance of the model while reducing the convergence time. In the multi-agent environment, MMRAC performs better than the other algorithms it is compared with, and it is more robust and scalable.
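The binary relations that the abstract attributes to the attention mechanism in the current RRL model can be illustrated with standard scaled dot-product attention over entity vectors. The following is a minimal single-head sketch in plain Python, not the thesis's MRC or MRCSH module: the function names, the toy `entities` list, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_attention(queries, keys, values):
    """Single-head scaled dot-product attention over N entities.

    weights[i][j] scores the relation between entity i and entity j,
    so each weight describes one pair of entities (a binary relation).
    outputs[i] is the weighted sum of all value vectors for entity i.
    """
    scale = math.sqrt(len(keys[0]))
    weights, outputs = [], []
    for q in queries:
        row = softmax([dot(q, k) / scale for k in keys])
        weights.append(row)
        outputs.append([
            sum(w * v[c] for w, v in zip(row, values))
            for c in range(len(values[0]))
        ])
    return weights, outputs


# Hypothetical toy entities; a real model would first apply learned
# linear projections to obtain separate queries, keys, and values.
entities = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, outputs = pairwise_attention(entities, entities, entities)
```

For N entities this produces an N x N weight matrix, so every entry relates exactly two entities. Relations involving more than two entities are not represented directly by a single such layer, which is the gap the abstract says the MRC module is meant to address.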

Keywords/Search Tags:

Reinforcement Learning, Relational Learning, Deep Learning, Artificial Intelligence

Related items

1	Research And Application Of Game Artificial Intelligence System Based On Machine Learning Methods
2	Reinforcement Learning Agent Design Based On Deep Perception And Imitation Learning
3	Research On Command Decision Method From RTS Perspective On Deep Learning
4	Research On Chess Game Based On Deep Reinforcement Learning
5	Research On Multi-Agent Deep Reinforcement Learning Methods And Applications
6	Supervised Reinforcement Learning: Methods And Applications
7	Sample Efficiency Improvement Method Of Deep Reinforcement Learning And Its Application In Video Bitrate Control
8	Research On Reinforcement Learning Based Control Method Of Magnetic Navigation AGV
9	Research On Security Deep Reinforcement Learning Based On Experiences
10	Integrating complexity science and artificial intelligence: GIS, agents and reinforcement learning for modeling forest cover change