Multi-Agent Collaborative Research In RoboCup2D Simulation Soccer Team

Posted on:2014-01-06

Degree:Master

Type:Thesis

Country:China

Candidate:H Zhou

Full Text:PDF

GTID:2248330395484077

Subject:Instrumentation engineering

Abstract/Summary:


Artificial intelligence emerged as an independent discipline in the 1950s. Regarded as a field within complex systems, it has attracted wide attention from scholars at home and abroad in recent years. Decision theory is an important research branch of artificial intelligence and has become a core issue in the selection and coordination of robot behavior. Research on decision theory matters because it gives humans better control over machines and lets machines play a greater role in serving humanity. This paper is based on the RoboCup2D simulation platform and aims to enhance the offensive and defensive ability of a simulation soccer team. Using multi-agent decision modeling and agent reinforcement learning, it discusses the problem of multi-agent coordination in depth.

This paper is divided into four sections. The first section summarizes the valuable results of previous studies and explains the essential background, such as the agents’ perceptual information (seeing, hearing, feeling) and action commands (kick, shoot, turn neck, etc.). It then studies agent positioning according to the different roles the agents play.

Second, the paper proposes a mechanism for online action sequence search based on a tree search algorithm. It evaluates the tree search algorithm and puts forward the concept of the action sequence.

It then briefly reviews reinforcement learning and Q-Learning, and proposes an experience accumulation algorithm together with the player hot-zone concept, thereby constructing a learning agent. During a game, an agent updates the E matrix if the action it has taken yields a positive return value. When it encounters a similar situation, it selects the same action with greater probability.

The former two studies both focus on improving the team’s offensive capability and overall competitive capability. The final study addresses defense. The goalkeeper is a unique player on the field. Given the particular nature of the goalkeeper’s actions, the paper models the goalkeeper’s decisions as a POMDP and solves the model with a value iteration algorithm. The result, which takes the form of an action sequence, is in fact a set of actions, each with the maximum return value.

The algorithms and mechanisms proposed in this paper are all tested on the RoboCup2D simulation platform.
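
The experience accumulation idea described in the abstract (update the E matrix only after a positive return, then favor the same action in a similar situation) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the class name `ExperienceTable`, the `base_weight` parameter, the additive update rule, the proportional sampling rule, and the situation key `"zone_7"` are all assumptions made here for illustration, since the abstract does not specify them.

```python
import random
from collections import defaultdict


class ExperienceTable:
    """Hypothetical E matrix: one accumulated weight per (situation, action)."""

    def __init__(self, actions, base_weight=1.0):
        self.actions = list(actions)
        # base_weight (assumed) keeps untried actions selectable.
        self.base_weight = base_weight
        # E[situation][action] -> accumulated positive return
        self.E = defaultdict(lambda: {a: 0.0 for a in self.actions})

    def update(self, situation, action, reward):
        # Only positive returns are accumulated, as the abstract describes.
        if reward > 0:
            self.E[situation][action] += reward

    def probabilities(self, situation):
        # Selection probability is proportional to base weight plus experience.
        weights = [self.base_weight + self.E[situation][a] for a in self.actions]
        total = sum(weights)
        return {a: w / total for a, w in zip(self.actions, weights)}

    def choose(self, situation, rng=random):
        probs = self.probabilities(situation)
        weights = [probs[a] for a in self.actions]
        return rng.choices(self.actions, weights=weights, k=1)[0]


# Usage: a rewarded action becomes more likely in the same situation.
table = ExperienceTable(["pass", "dribble", "shoot"])
table.update("zone_7", "shoot", reward=2.0)     # positive return: accumulated
table.update("zone_7", "dribble", reward=-1.0)  # non-positive return: ignored
print(table.probabilities("zone_7"))
print(table.choose("zone_7"))
```

Sampling in proportion to accumulated experience, rather than always taking the best-scoring action, is one simple way to keep some exploration while still biasing the agent toward actions that paid off before.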

Keywords/Search Tags:

MAS, Tree Search Algorithm, Action Sequence, Experience Cumulation, POMDP


Related items

1	The Research And Design Of Approximation Methods For POMDP
2	Research On First-order Decision-theoretic Planning In Relational Uncertain Environments
3	Improving dynamic decision making through RFID: A partially observable Markov decision process (POMDP) for RFID-enhanced warehouse search operations
4	Study On The Defensive Action And Movement Of Simulation 2D Soccer System
5	Research On The Evolutionary Search Algorithms In The WEB Based On Learning
6	Research And Implementation Of Video Action Search System Based On Temporal Action Detection
7	Research On Point-based Value Iteration Algorithms In POMDP Domains
8	Research On Fast Motion Estimation Algorithms Based On Search Experience
9	Human Action Recognition Based On Depth Sequence
10	Research Of Sequential Pattern Algorithm Over Data Streams Based On Prefix Sequence Tree