
Study Of Q-learning Algorithm In Multi-agent System

Posted on: 2013-04-27    Degree: Master    Type: Thesis
Country: China    Candidate: L Qiao    Full Text: PDF
GTID: 2248330377455225    Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the development of artificial intelligence and computer science, distributed artificial intelligence and related technologies have emerged and developed rapidly over the past twenty years. The Multi-Agent System (MAS) is an important branch of distributed artificial intelligence. MAS research mainly concerns controlling complex behaviors or solving tasks through interaction, cooperation, competition, negotiation, and other intelligent behaviors. Reinforcement learning is an effective tool for solving optimization problems in agent systems, and Q-learning is the most widely researched and applied reinforcement learning algorithm.

This thesis studies the learning efficiency, state-space complexity, and adaptability to complex environments of the Q-learning algorithm in MAS, using the classic pursuit (rounding-up) problem as the simulation environment, and proposes several improved methods. The specific research work is as follows:

First, to address repeated learning and low learning efficiency, this thesis proposes a Q-learning algorithm in which agents share experience during learning. Q-learning is an unsupervised online learning method that requires no prior knowledge of the environment; as a consequence, agents must spend time acquiring that knowledge themselves. A characteristic of standard multi-agent Q-learning is that the whole learning process is joint, including joint actions, joint states, and joint rewards or punishments, which enlarges the learning search volume and the dimension of the state space. To solve these problems, the proposed algorithm is not based on joint actions, joint states, or joint rewards: every agent learns separately, and the agents share experience in stages. The algorithm imitates human team learning, in which agents have a common goal as well as their own tasks to accomplish, and share experience periodically. Simulation experiments show better learning outcomes, with learning efficiency clearly superior to the standard Q-learning algorithm.

Second, for complex learning environments, this thesis proposes a Q-learning algorithm based on multiple reward standards. Real application environments are more complex than theoretical ones, and multi-agent Q-learning must be able to adapt to this complexity. The approach is to split the complex learning environment into smaller ones, each with its own reward standard and its own sub-goal. This allows a reward standard tailored to each sub-environment, so that each stage goal can be reached efficiently. Simulation results show that the Q-learning algorithm based on multiple reward standards completes learning tasks efficiently and adapts flexibly to different environments and states.
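To illustrate the experience-sharing idea summarized above, the following is a minimal sketch of independent Q-learning agents that periodically merge their Q-tables. The toy 1-D chase environment, the parameter values, and the merging rule (averaging Q-values) are illustrative assumptions for this sketch, not the thesis's exact formulation.

    import random
    from collections import defaultdict

    # Hyperparameters (illustrative values, not from the thesis).
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration
    ACTIONS = ["left", "right", "stay"]

    class Agent:
        def __init__(self):
            # Q-table: state -> {action: value}; each agent learns independently.
            self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

        def choose(self, state):
            # Epsilon-greedy action selection.
            if random.random() < EPSILON:
                return random.choice(ACTIONS)
            return max(self.q[state], key=self.q[state].get)

        def update(self, state, action, reward, next_state):
            # Standard Q-learning update:
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = max(self.q[next_state].values())
            self.q[state][action] += ALPHA * (reward + GAMMA * best_next
                                              - self.q[state][action])

    def share_experience(agents):
        # Staged experience sharing: agents merge Q-values for every state any
        # of them has visited, here by simple averaging (one possible rule).
        states = set()
        for ag in agents:
            states.update(ag.q.keys())
        for s in states:
            for a in ACTIONS:
                avg = sum(ag.q[s][a] for ag in agents) / len(agents)
                for ag in agents:
                    ag.q[s][a] = avg

    if __name__ == "__main__":
        # Tiny demonstration on a 1-D "chase" toy: the state is the distance to a
        # stationary target; reward is 1 when the distance reaches 0.
        agents = [Agent() for _ in range(2)]
        for episode in range(200):
            for ag in agents:
                dist = 5
                for _ in range(20):
                    s = dist
                    a = ag.choose(s)
                    dist = (max(0, dist - 1) if a == "left"
                            else min(9, dist + 1) if a == "right" else dist)
                    r = 1.0 if dist == 0 else -0.01
                    ag.update(s, a, r, dist)
                    if dist == 0:
                        break
            if episode % 20 == 0:
                share_experience(agents)   # share experience in stages

The same skeleton extends to the multi-standard-reward setting by replacing the single reward line with a reward function chosen per sub-environment or stage.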
Keywords/Search Tags: MAS, Q-learning algorithm, experience sharing, multi-standard rewards, pursuit problem