Reinforcement Learning Based Research On Robot Foraging Problem

Posted on:2014-06-17

Degree:Master

Type:Thesis

Country:China

Candidate:J J Li

Full Text:PDF

GTID:2268330392969136

Subject:Control Science and Engineering

Abstract/Summary:


A robot's ability to adapt to its environment determines its degree of intelligence. In complex, unstructured environments in particular, adaptability is a critical issue, and it requires the robot to learn from the environment. Reinforcement learning needs few preconditions, gives robots a way to learn behaviors, and thereby improves their adaptability. Foraging is a broadly representative behavior with important practical applications, so this thesis studies robot foraging using reinforcement learning.

The key problems of reinforcement learning are convergence and convergence rate, which determine whether the robot can learn to forage successfully and how fast it learns. This thesis first proposes decomposing the foraging behavior into blocks, each composed of elementary behaviors, which greatly reduces the learning space. A standard Markov decision process (MDP) model is then established. On this basis, a small amount of prior human knowledge is added to accelerate learning. A simulation experiment in which a single robot learns to forage with the Q-learning algorithm shows that task decomposition and prior knowledge clearly improve online learning speed.

Compared with a single-robot system, a multi-robot system offers concurrency, robustness, and other advantages. The thesis therefore studies cooperative multi-robot foraging using an average-reward reinforcement learning algorithm. It proposes a relative value iteration (RVI) reinforcement learning algorithm based on Schweitzer's transformation. The MDP model of multi-robot foraging is established in the same way as for the single robot, and the new RVI algorithm is applied to it. Finally, a simulation experiment comparing the new algorithm with Q-learning finds that the new RVI algorithm is effective and highly reliable.
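To make the single-robot Q-learning setup concrete, the following is a minimal sketch of tabular Q-learning on a toy foraging task. It is not the thesis's model: the five-cell corridor, the reward of 1 for reaching the food, the hyperparameters, and all names (`step`, `choose`, `q_learning`, `greedy_policy`) are assumptions made for illustration, and neither the behavior decomposition nor the prior knowledge described in the abstract is modeled here.

```python
import random

# Hypothetical toy world: a corridor of cells 0..4 with food in the last cell.
N_CELLS = 5
ACTIONS = (-1, +1)  # move one cell left, move one cell right


def step(state, action):
    """Deterministic toy dynamics; walls clamp the robot inside the corridor."""
    nxt = min(max(state + action, 0), N_CELLS - 1)
    done = nxt == N_CELLS - 1          # the episode ends when food is reached
    return nxt, (1.0 if done else 0.0), done


def choose(q_row, epsilon, rng):
    """Epsilon-greedy action index, breaking ties between equal values at random."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; returns a table q[state][action_index]."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_CELLS)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):           # step cap so every episode terminates
            a = choose(q[state], epsilon, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            # One-step target: no bootstrapping from the terminal cell.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action (as a move of -1 or +1) for each non-terminal cell."""
    return [ACTIONS[row.index(max(row))] for row in q[:-1]]
```

Calling `greedy_policy(q_learning())` gives the learned move for each non-terminal cell; in this toy corridor the policy should settle on moving right, toward the food, from every cell. A smaller state space, which is what the abstract's behavior decomposition aims for, shrinks the table `q` and with it the number of updates needed.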

Keywords/Search Tags:

robot, forage, reinforcement learning, MDP, relative value iteration


Related items

1	Navigation Control For Autonomous Mobile Robots Based On Reinforcement Learning
2	Research On Motion Control Of Mobile Robots Based On Reinforcement Learning
3	Research On Policy Iteration Algorithm Within Bayesian Reinforcement Learning
4	Research On Intensive Learning Based On Motivation And Its Application
5	Sampled-data Iterative Learning Control For Continuous-time Nonlinear Systems With Iteration Varying Lengths
6	Research On Application Of Reinforcement Learning In Swing-up And Balance Control Of Inverted Pendulum
7	Research On Reinforcement Learning Methods Based On Weighted Double Mechanisms
8	Policy Iteration Reinforcement Learning Based On Geodesic Gaussian Kernel
9	Efficient approximate policy iteration methods for sequential decision making in reinforcement learning
10	Research On Reinforcement Learning Methods For Navigation And Control Of Autonomous Mobile Robots