Adaptive Agent Collaborative Planning Based on Model-Free Reinforcement Learning

Posted on: 2020-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wan
Full Text: PDF
GTID: 2428330605969364
Subject: Software engineering
Abstract/Summary:
A Multi-agent Collaboration System (MACS) is designed to let agents respond to environmental changes in real time and to organize them effectively so that they achieve adaptive goals together. Agent planning and agent cooperation are two important aspects of agent research. Existing methods fall mainly into two groups: logical reasoning based on a known environment, and machine learning. Logical reasoning over a known environment handles dynamic planning in a changing environment, but it cannot solve planning and cooperation problems in an unknown environment. Machine-learning methods can plan and cooperate in an unknown environment, but their decision-making efficiency is lower than that of logical reasoning.

Building on Jason and JaCaMo, this thesis proposes an adaptive agent collaborative planning method based on model-free reinforcement learning. First, to address strategy planning for ASL in unknown and dynamic environments, the Q-learning reinforcement learning algorithm is used to realize agent learning and planning in the ASL model. Second, to address the optimal assignment of tasks to roles in JaCaMo, an optimal role assignment algorithm based on a broadcast mechanism is proposed, building on the Q-learning-based ASL planning method. Finally, the improved Jason and JaCaMo models are used to model an RCRSS simulation scenario, and a comparison of the original and improved models on RCRSS shows that the proposed method is feasible and effective.

The research has the following characteristics. First, the ASL model integrates rule description, logical reasoning, and reinforcement learning. Because ASL cannot make decisions in an unknown environment, Q-learning is used to dynamically generate the optimal action sequence for a given goal, and rules that can be used for reasoning are created, so that agent planning both adapts to changes in the unknown environment and executes tasks efficiently. Second, in the JaCaMo model, to address task assignment among agents that hold roles, agents of the same role class broadcast the reward values they have learned in ASL. Because each agent then holds the reward values of the other agents, it can choose the optimal task to perform at each step, which could greatly improve the execution efficiency of the whole system.
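
The two mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation, which targets Jason and JaCaMo. The corridor environment, the reward values, and the names `step`, `q_learning`, `greedy_plan`, and `assign_by_broadcast` are all assumptions made for illustration. The first part uses standard tabular Q-learning to learn values without a model of the environment and then reads off an action sequence. The second part shows one possible way that agents sharing a role could pick tasks from broadcast values; the greedy, task-by-task rule used here is an assumption and is not claimed to be the thesis's optimal algorithm.

```python
import random

# Hypothetical 1-D corridor: states 0..4, goal at state 4.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Environment transition; the learner only sees sampled outcomes."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: estimate Q(s, a) from sampled transitions only."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


def greedy_plan(q, start=0, max_steps=20):
    """Read off an action sequence by following the learned values greedily."""
    plan, state = [], start
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        plan.append(action)
        state = step(state, action)[0]
    return plan


def assign_by_broadcast(values):
    """values[agent][task] is the learned value that agent broadcast.

    Every agent holds the same table after the broadcast, so each one can
    compute the same assignment locally. Assumed rule: tasks are handled in
    sorted order and each goes to the free agent with the highest value.
    """
    assignment, free = {}, set(values)
    tasks = sorted({task for per_agent in values.values() for task in per_agent})
    for task in tasks:
        if not free:
            break
        best = max(sorted(free),
                   key=lambda agent: values[agent].get(task, float("-inf")))
        assignment[task] = best
        free.remove(best)
    return assignment
```

Calling `greedy_plan(q_learning())` yields the action sequence for the corridor, and `assign_by_broadcast` can be called with a dictionary such as `{"a1": {"fire": 0.9, "rescue": 0.2}, "a2": {"fire": 0.4, "rescue": 0.7}}`, where the agent and task names are again hypothetical.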
Keywords/Search Tags: Agent Cooperation, JaCaMo, Q-learning, Task Assignment, Role