Reinforcement Learning Based Research On Robot Foraging Problem

Posted on: 2014-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: J J Li
Full Text: PDF
GTID: 2268330392969136
Subject: Control Science and Engineering
Abstract/Summary:
A robot's ability to adapt to its environment determines its degree of intelligence. In complex, unstructured environments in particular, adaptability is a critical issue, and it requires the robot to be able to learn from the environment. Reinforcement learning needs few preconditions, so it offers robots a way to learn behaviors and adapt well. Foraging behavior is broadly representative and has important application value. This thesis therefore studies foraging behavior using reinforcement learning.

The key problems of reinforcement learning are convergence and convergence rate, which determine whether the robot can learn to forage successfully and how fast it learns. This thesis first proposes decomposing foraging behavior into blocks, each composed of elementary behaviors, which greatly reduces the learning space. A standard Markov Decision Process (MDP) model is then established. On this basis, a small amount of prior human knowledge is added to accelerate learning. A simulation experiment in which a single robot learns to forage with the Q-learning algorithm shows that task decomposition and prior knowledge markedly improve online learning speed.

Compared with a single-robot system, a multi-robot system offers concurrency, robustness, and other merits. This thesis therefore also discusses cooperative multi-robot foraging using an average-reward reinforcement learning algorithm. We put forward a relative value iteration (RVI) reinforcement learning algorithm based on Schweitzer's transformation. The MDP model of multi-robot foraging is then established in the same way as for the single robot, and the new RVI algorithm is applied to multi-robot foraging. Finally, we compare the new algorithm with Q-learning in a simulation experiment and find that the new RVI algorithm is effective and highly reliable.
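The single-robot idea in the abstract (Q-learning on a small foraging MDP, with a little prior knowledge to speed up learning) can be illustrated roughly as follows. This is a minimal sketch and not the thesis's actual model: the one-dimensional corridor, the reward values, the hyperparameters, and the names `step`, `init_q`, and `train` are all assumptions made for illustration. Prior knowledge is modeled here simply as a small initial bias in the Q-table toward "head for the food, then head home".

```python
import random

N_CELLS = 6          # hypothetical corridor: cells 0..5, home at 0, food at 5
ACTIONS = (-1, +1)   # action 0 = move left, action 1 = move right

def step(state, action):
    """Apply one move; return (next_state, reward, done)."""
    pos, carrying = state
    pos = min(max(pos + ACTIONS[action], 0), N_CELLS - 1)
    if not carrying and pos == N_CELLS - 1:
        carrying = 1                       # food is picked up on arrival
    if carrying and pos == 0:
        return (pos, carrying), 1.0, True  # food delivered home
    return (pos, carrying), -0.01, False   # small cost per step (assumed)

def init_q(use_prior):
    """Build the Q-table; the optional prior favors food first, then home."""
    q = {}
    for pos in range(N_CELLS):
        for carrying in (0, 1):
            values = [0.0, 0.0]
            if use_prior:
                values[0 if carrying else 1] = 0.1  # assumed bias size
            q[(pos, carrying)] = values
    return q

def train(use_prior, episodes=200, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular epsilon-greedy Q-learning; returns (Q-table, steps per episode)."""
    rng = random.Random(seed)
    q = init_q(use_prior)
    steps_per_episode = []
    for _ in range(episodes):
        state, done, steps = (0, 0), False, 0
        while not done and steps < 500:
            if rng.random() < epsilon:
                action = rng.randrange(2)                         # explore
            else:
                action = max((0, 1), key=lambda a: q[state][a])   # exploit
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `train(use_prior=True)` and `train(use_prior=False)` and comparing the two `steps_per_episode` lists gives a rough feel for how an initial bias can shorten early episodes in this toy setting. It says nothing about the behavior-block decomposition or the RVI algorithm described in the thesis, which are not reproduced here.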
Keywords/Search Tags: robot, forage, reinforcement learning, MDP, relative value iteration