
Research On Multiagent Reinforcement Learning Algorithm In Continuous Action Space

Posted on: 2019-06-05
Degree: Master
Type: Thesis
Country: China
Candidate: G S Liu
Full Text: PDF
GTID: 2428330626952123
Subject: Software engineering

Abstract/Summary:
Many real-world problems, such as urban traffic control, network packet delivery, and video games, are naturally modeled as multiagent systems. In a multiagent system, an agent often needs to coordinate with other agents, and significant effort has been devoted to multiagent coordination problems. Most existing approaches are extensions of Q-learning, such as distributed Q-learning, Policy Hill Climbing (PHC), and recursive Frequency Maximum Q-Value (rFMQ). However, these algorithms can only handle multiagent coordination in discrete action spaces.

A number of studies have addressed single-agent learning in continuous action spaces; they fall into two major categories. The most common algorithms are based on function approximation, which can be further divided into value approximation and policy approximation methods; the others are based on Monte Carlo sampling. All of these algorithms are designed for single-agent environments and cannot be applied to multiagent systems directly.

In this thesis, we propose a framework to efficiently solve the multiagent coordination problem in continuous action spaces. Within this framework, we propose a novel algorithm, Continuous Action Learning Automata with recursive Frequency Maximum Q-Value (CALA-rFMQ), that leverages the advantages of existing multiagent discrete-action learning algorithms and single-agent continuous-action algorithms. A CALA-rFMQ agent first samples uniformly in the continuous action space to obtain several discrete actions. Second, each agent selects the expected actions from these sampled actions and divides the continuous action space into sub-spaces; in this step, we extend the idea of rFMQ and propose Win or Learn Slow Policy Hill Climbing (WoLS-PHC) to evaluate each sampled action by its expected reward. Finally, CALA-rFMQ transfers the prior knowledge about the expected actions from the previous step to the continuous sub-spaces as initialization; in this step, we extend CALA with the Win or Learn Slow (WoLS) principle to search the sub-spaces for the final optimal actions, which improves the exploration efficiency of CALA.

The experiments are conducted on single-state continuous-action versions of the climbing game and on the multi-state cooperative boat game. Experimental results show that our algorithm quickly converges to the global optimum and outperforms previous work.
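To make the three steps above concrete, here is a minimal Python sketch of such a sample-evaluate-refine pipeline. The thesis's exact update rules are not given in this abstract: the uniform sampling grid, the rFMQ-style optimistic evaluation, and the CALA-style Gaussian search below are standard textbook forms, and the action interval, the sample count K, the learning rates, and the reward function f are illustrative assumptions (the WoLS step-size modulation is omitted).

import math
import random

# Hypothetical sketch of a CALA-rFMQ-like pipeline; the updates below
# are standard rFMQ / CALA forms, NOT the thesis's exact equations, and
# all constants and the reward f() are assumptions for illustration.

K = 10                   # number of uniformly sampled discrete actions
LOW, HIGH = 0.0, 1.0     # continuous action interval (assumed)

def f(a):
    # Stand-in for the unknown, noisy reward of playing action a.
    return -(a - 0.7) ** 2 + random.gauss(0.0, 0.01)

# Step 1: sample uniformly to obtain discrete candidate actions.
actions = [LOW + (HIGH - LOW) * (i + 0.5) / K for i in range(K)]

# Step 2: rFMQ-style evaluation: per action, track the mean Q-value,
# the best reward seen, and the frequency of observing that best.
Q = [0.0] * K
Rmax = [-math.inf] * K
Freq = [0.0] * K

def rfmq_update(i, r, alpha=0.1, beta=0.1):
    Q[i] += alpha * (r - Q[i])
    if r > Rmax[i]:
        Rmax[i], Freq[i] = r, 1.0
    else:
        Freq[i] += beta * ((1.0 if r >= Rmax[i] else 0.0) - Freq[i])

def evaluate(i):
    if Freq[i] == 0.0:            # never sampled yet
        return Q[i]
    # Optimistic blend between mean and maximum, weighted by frequency.
    return (1.0 - Freq[i]) * Q[i] + Freq[i] * Rmax[i]

for _ in range(500):
    i = random.randrange(K)
    rfmq_update(i, f(actions[i]))

# Step 3: CALA-style Gaussian search in the sub-space around the best
# sampled action, initialized with the knowledge gained in step 2.
best = max(range(K), key=evaluate)
mu, sigma = actions[best], (HIGH - LOW) / K
sigma_min, lam = 0.02, 0.01

for _ in range(2000):
    phi = max(sigma, sigma_min)
    x = random.gauss(mu, phi)
    delta = (f(x) - f(mu)) / phi  # did x beat the current mean action?
    d = (x - mu) / phi
    mu += lam * delta * d         # pull the mean toward better actions
    sigma = max(sigma + lam * delta * (d * d - 1.0), sigma_min)

print("estimated optimal action:", mu)

In the full algorithm each agent would presumably run such a loop against the joint reward of the team, with the WoLS principle modulating the step size depending on whether the agent is currently winning, in the spirit of WoLF-PHC; those details are beyond what this abstract specifies.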
Keywords/Search Tags: Multiagent systems, Continuous action space, Reinforcement learning, Coordination game