
Study Of Multi-agent Foraging Based On CE-Q Reinforcement Learning And K-means Clustering Integrated Algorithm

Posted on: 2015-02-06
Degree: Master
Type: Thesis
Country: China
Candidate: M H Lei
GTID: 2268330428481451
Subject: Mechanical Manufacturing and Automation
Abstract/Summary:
Compared with a single agent, a multi-agent system offers advantages in adaptability, economy, robustness, flexibility and fault tolerance, so it is well suited to replacing human labor in production, and even in military use, under harsh, dangerous or unhealthy environments. For a multi-agent system to be useful in practice, it needs a machine learning control method that lets it adapt to its environment. Reinforcement learning, a widely studied machine learning method, gives a multi-agent system the ability to learn online by itself, and is therefore widely used for behavior learning in multi-agent systems. Current research on reinforcement learning focuses mainly on convergence and on balancing overall and individual interests, and on practical problems such as the lack of continuous learning caused by high computational complexity and the curse of dimensionality, and the efficiency of the system. This thesis addresses these problems. It applies Q reinforcement learning to a multi-agent system, briefly introduces reinforcement learning algorithms based on game theory, and analyzes their advantages and disadvantages in practical applications. The CE-Q algorithm is chosen because it ensures convergence and rationality and has relatively low computational complexity, although it still faces the curse of dimensionality in actual tasks. To avoid the curse of dimensionality, environmental features are clustered with the K-Means algorithm, so that reinforcement learning works on a mapping from "kind of environment state" to strategy. A method for instantly rewarding the action process of each agent is also proposed: an instant reward function for the action process is added to the traditional reward function, which rewarded only results.
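The clustering step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the feature vectors, the action names and the helper functions (`kmeans`, `state_of`) are hypothetical, and a plain Q-table keyed by cluster index stands in for the CE-Q value tables. The point is only that the table grows with the number of clusters rather than with the number of raw environment states.

```python
import random

def squared_distance(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid(point, centroids):
    # Index of the centroid closest to the given point.
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, iterations=20, seed=0):
    # Basic Lloyd's algorithm; initial centroids are k distinct sample points.
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_centroid(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids

# Hypothetical 2-D environment features, for example
# (distance to food, distance to nearest teammate).
features = [(0.1, 0.2), (0.15, 0.1), (0.2, 0.25),
            (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
centroids = kmeans(features, k=2)

# Hypothetical action set; the thesis does not list its actions here.
ACTIONS = ["forward", "left", "right"]

# One Q-value per (cluster, action) pair instead of per raw state.
q_table = {(c, a): 0.0 for c in range(len(centroids)) for a in ACTIONS}

def state_of(feature):
    # Map a raw feature vector to its "kind of environment state".
    return nearest_centroid(feature, centroids)
```

With two clusters and three actions the table holds six entries regardless of how many distinct feature vectors the agents observe.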
In accomplishing a complicated task, each agent of the multi-agent system has to complete a series of actions. The instant reward for the action process rewards every action immediately, so that agents make fuller use of the experience generated during the process, which greatly reduces the probability that wrong or inefficient actions earn rewards. The last part verifies the improved CE-Q and K-means combined algorithm. The simulation platform is built with Matlab and a multi-agent toolbox, and multi-agent foraging, which has a wide application background, is used as the simulation task. Three typical reinforcement learning algorithms were run in the experiment and compared. The experimental results showed the effectiveness and superiority of the improved CE-Q in practical application. The improved CE-Q and K-means combined algorithm proposed in this thesis ensures convergence and rationality, and has the advantages of low computational complexity, high cooperative learning speed and high system efficiency. The curse of dimensionality of the algorithm is also alleviated. Hence the algorithm can be expected to have good practical value.
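The idea of adding an instant reward for the action process to the result-only reward can be sketched as follows. The abstract does not give the form of the reward function, so everything here is an assumption made for illustration: the process reward is taken to be the reduction in distance to the agent's current target, and the constants and function names are hypothetical.

```python
def result_reward(delivered_food):
    # Traditional outcome-only reward: paid only when the result is achieved.
    # The value 10.0 is an arbitrary illustrative constant.
    return 10.0 if delivered_food else 0.0

def process_reward(dist_before, dist_after, weight=1.0):
    # Hypothetical instant reward for one action: positive if the action
    # moved the agent closer to its target, negative if it moved away.
    return weight * (dist_before - dist_after)

def total_reward(dist_before, dist_after, delivered_food, weight=1.0):
    # Combined reward: result term plus per-action process term.
    return (result_reward(delivered_food)
            + process_reward(dist_before, dist_after, weight))
```

Under these assumptions an action that moves the agent away from its target receives a negative reward immediately, instead of going unpunished until the task ends.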
Keywords/Search Tags: Multi-agent, CE-Q reinforcement learning algorithm, instant reward for action process, K-Means clustering algorithm, foraging