
Multi-agent Reinforcement Learning Based On Gaussian Regression In Continuous Spaces

Posted on: 2014-04-30, Degree: Master, Type: Thesis
Country: China, Candidate: H J Wei, Full Text: PDF
GTID: 2268330425973203, Subject: Control Science and Engineering
Abstract: Multi-agent systems (MAS) are widespread in daily life. This thesis studies how a MAS can learn behavioral strategies through reinforcement learning (RL), focusing on two problems: generalization and the "curse of dimensionality". RL theory and its related definitions were originally developed for discrete environments, whereas practical application environments are inherently continuous. This mismatch has greatly limited the range of application of RL, so generalization has become the key to making RL practical. Meanwhile, as MAS theory has progressed, RL has developed from simple single-agent reinforcement learning (SARL) to the more complicated multi-agent reinforcement learning (MARL). However, the learning space and storage space grow exponentially with the number of agents in the MAS, so the "curse of dimensionality" becomes more prominent, which may lead to low learning efficiency and can even destroy learning convergence.

This thesis focuses on generalization and the "curse of dimensionality" in MAS. On the one hand, V-function and Q-function models are built in real time to generalize over the state and action spaces. On the other hand, a dimension-reduced Q-function is developed to reduce the dimensions of the learning space and storage space, and model-based learning improves learning efficiency.

Firstly, starting from the basic definitions of RL and taking the MAS application environment into account, the general framework of MARL and its typical algorithms are discussed. The nature of generalization and of the "curse of dimensionality" is then analyzed, and a general idea and theoretical guidance for addressing them are given.

Secondly, assuming that the joint reward function is known and that the learning agent follows a static, stable strategy, a learning algorithm for MAS in continuous spaces, called tracking learning based on Gaussian regression, is presented. It builds on the dimension-reduced Q-value function. To realize generalization, models of the value functions are constructed using Gaussian regression, and the time and space complexity of the algorithm are analyzed.

Thirdly, to extend the adaptability of the algorithm and remove the above assumptions, a modified algorithm called model-based reinforcement learning with companion's policy tracking for MAS (MAS MBRL-CPT) is proposed. A new expected immediate reward is defined that merges the observation of the companion's policy into the payoff fed back from the environment, and its value is estimated online by stochastic approximation. In addition, a model of the behavior strategy is established online and used to update the sample space.

Then, a coordination mechanism called Time-Sharing Learning is introduced into the MAS MBRL-CPT algorithm, so that all agents take turns updating their response strategies through learning and optimize the cooperation strategy. Moreover, the optimal cooperation strategy is formed by simultaneous learning.

Finally, in simulations of the Multi-cart-pole and Line-up tasks in continuous spaces, the proposed algorithms show high efficiency and good generalization ability.
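
The abstract does not give the details of how the value-function models are built, so the following is only a minimal sketch of the general idea of generalizing a Q-function over continuous state and action spaces by regression. It assumes that "Gaussian regression" refers to Gaussian process regression with a squared-exponential kernel. The class name `GPQModel`, the hyperparameter values, and the toy target Q(s, a) = -(s - a)^2 are all hypothetical and are not taken from the thesis.

```python
import numpy as np


def rbf_kernel(A, B, length_scale, signal_var):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative distances caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


class GPQModel:
    """Hypothetical Q(s, a) model: zero-mean GP regression over (s, a) rows."""

    def __init__(self, length_scale=0.3, signal_var=1.0, noise_var=1e-4):
        self.length_scale = length_scale
        self.signal_var = signal_var
        self.noise_var = noise_var

    def fit(self, X, y):
        # Kernel matrix of the samples plus observation noise on the diagonal.
        K = rbf_kernel(X, X, self.length_scale, self.signal_var)
        K += self.noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        # alpha = K^{-1} y, computed through two triangular systems.
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))
        self.X = X
        return self

    def predict(self, Xq):
        # Posterior mean and variance of Q at the query (s, a) rows.
        Ks = rbf_kernel(Xq, self.X, self.length_scale, self.signal_var)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = self.signal_var - np.sum(v**2, axis=0)
        return mean, np.maximum(var, 0.0)


# Toy samples (an assumption for illustration): Q(s, a) = -(s - a)^2 on a grid.
grid = np.linspace(-1.0, 1.0, 7)
S_grid, A_grid = np.meshgrid(grid, grid, indexing="ij")
X_train = np.column_stack([S_grid.ravel(), A_grid.ravel()])
y_train = -(X_train[:, 0] - X_train[:, 1]) ** 2

model = GPQModel().fit(X_train, y_train)

# Greedy action for one state: evaluate the model on candidate actions.
state = grid[4]
candidates = np.column_stack([np.full_like(grid, state), grid])
q_mean, q_var = model.predict(candidates)
greedy_action = grid[np.argmax(q_mean)]
```

In this sketch the regression cost grows cubically with the number of stored samples because of the Cholesky factorization, which is one reason time and space complexity matter for this kind of model. The predictive variance `q_var` is not used here, but it indicates where the sample set gives little information about the value function.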
Keywords: multi-agent system, model-based reinforcement learning, generalization, curse of dimensionality, Gaussian regression