Research On Reinforcement Learning Method And Its Application Technology

Posted on: 2013-01-11  Degree: Master  Type: Thesis
Country: China  Candidate: H Li  Full Text: PDF
GTID: 2248330395455520  Subject: Computer application technology
Abstract/Summary:
Reinforcement learning is an important machine learning method that is used extensively in intelligent control and in analysis and prediction. The learner improves its own behavior through trial-and-error interaction with the environment, so the method is well suited to learning control strategies in a problem domain about which the learner has little prior knowledge. Multi-agent reinforcement learning extends traditional reinforcement learning by having multiple agents learn together, which makes the method more suitable for open, complex, and dynamic environments.

First, for single-agent reinforcement learning, reinforcement learning with heuristic action selection is studied and improved. After each learning episode, the state transitions in that episode are analyzed using a state back method. This information is then used to guide the agent’s next action selection and so accelerate the learning process.

Second, for centralized multi-agent reinforcement learning, a decomposition strategy divides the general task into several subtasks, which are distributed to independent agents. During the learning process, each agent learns from the experience of the other agents and shares its own experience with them at the same time. Each agent can thus reinforce good behaviors through an experience summary method, which speeds up the convergence of the learning process. The results of a multi-objective pursuit game demonstrate the effectiveness of the method.

Finally, for multi-agent cooperative reinforcement learning with joint actions, each agent first establishes its own co-tree to select partners, then combines team Markov games with Q-learning to influence the selection of the joint action strategy, so that the joint actions of all the cooperating agents can converge to the global optimum. A red-versus-blue experiment verifies the feasibility of this method.
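
For readers unfamiliar with the Q-learning baseline that the abstract builds on, the following is a minimal sketch of standard tabular Q-learning with epsilon-greedy action selection. It is not the thesis's improved algorithm: the heuristic action selection, state back analysis, experience sharing, and co-tree are not reproduced here. The one-dimensional corridor environment, the function names `q_learning` and `greedy_policy`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def q_learning(n_states=6, episodes=300, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor (illustrative only).

    States are 0..n_states-1. Action 0 moves left (blocked at state 0),
    action 1 moves right. Entering the rightmost state gives reward 1
    and ends the episode; every other step gives reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on steps per episode
            # Trial and error: explore with probability epsilon, or when
            # both actions currently look equally good.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == goal else 0.0
            # The goal is terminal, so no future value is bootstrapped there.
            target = r if s2 == goal else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if s == goal:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action index for each state."""
    return [max(range(2), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(q_learning())` gives the learned action for each state; in this toy corridor the non-terminal states should all prefer moving right toward the reward. The methods summarized in the abstract would change how actions are selected and how experience is reused, not the basic update shown here.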
Keywords/Search Tags: Reinforcement Learning, Q-learning, Multi-agent, Markov Games