
The Study Of Robotic Arm Control Policy Based On DQN

Posted on: 2019-05-12  Degree: Master  Type: Thesis
Country: China  Candidate: X Guo  Full Text: PDF
GTID: 2348330542987611  Subject: Computer Science and Technology
Abstract/Summary:
In recent years, deep reinforcement learning has been a major research focus in artificial intelligence, and its application to intelligent robot control is an important future direction. Algorithms based on deep reinforcement learning have achieved breakthroughs in many fields, particularly robotic control. The pioneering work in deep reinforcement learning is DQN (Deep Q-Network), which combines a convolutional neural network with the traditional Q-learning algorithm and thereby solves the problem that a traditional robot cannot perceive its environment when making decisions. The key question addressed in this dissertation is therefore how to apply DQN and its improved variants in a real environment so that a manipulator can learn an optimal policy. The goal is to obtain, through training, a policy network that allows a robotic arm to choose actions directly from high-dimensional raw sensory data, achieving end-to-end control from raw input to output. The main research contents and contributions of this dissertation are as follows.

First, a robotic-arm policy control algorithm based on bootstrapped (guided) DQN is proposed and studied. The main idea of this algorithm is to use the bootstrap method, with multiple branch networks, to randomize the value function, temporarily widening the explored region of the state space and thereby achieving deep exploration. This distributed deep exploration ensures that the agent explores different policies, produces diverse samples, and generalizes the dynamic information of the environment across the state space more effectively.

Second, a robotic-arm policy control algorithm based on recurrent DQN is proposed. Real-world tasks often feature incomplete and noisy state information resulting from partial observability. DQN is modified to handle the noisy observations characteristic of POMDPs by combining a recurrent neural network with the Deep Q-Network. By adding this recurrent structure to the DQN network, the model gains the ability to remember along the time axis and to cope with missing information.

Finally, the object-grasping task of a robotic arm in a real environment is completed with DQN and its improved models. A robotic-arm safety mechanism is also proposed, in order to guarantee the continuity of the training process while avoiding safety problems such as the arm colliding with itself or with external objects. Experimental results demonstrate the effectiveness of the two algorithms designed in this dissertation.
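The bootstrapped deep-exploration idea described above can be sketched in a few lines. The sketch below is a minimal tabular illustration, not the dissertation's actual network: K independent Q-tables stand in for the bootstrap heads, one head is sampled per episode to drive action selection, and each head trains only on its own random bootstrap mask of the experience. All names and hyperparameters (`BootstrappedQ`, `n_heads`, `mask_prob`) are illustrative assumptions.

```python
import random

class BootstrappedQ:
    """Tabular sketch of bootstrapped deep exploration: K Q-"heads",
    one sampled per episode, each trained on a random subset of data."""

    def __init__(self, n_states, n_actions, n_heads=5, lr=0.1,
                 gamma=0.99, mask_prob=0.5, seed=0):
        self.rng = random.Random(seed)
        self.heads = [[[0.0] * n_actions for _ in range(n_states)]
                      for _ in range(n_heads)]
        self.lr, self.gamma, self.mask_prob = lr, gamma, mask_prob
        self.active = 0  # head used for acting in the current episode

    def begin_episode(self):
        # Sampling a head randomizes the value function for a whole
        # episode, which is what produces temporally extended exploration.
        self.active = self.rng.randrange(len(self.heads))

    def act(self, state):
        q = self.heads[self.active][state]
        return max(range(len(q)), key=q.__getitem__)

    def update(self, s, a, r, s2, done):
        # Each head sees the transition only with probability mask_prob
        # (the bootstrap mask), so the heads stay diverse.
        for q in self.heads:
            if self.rng.random() < self.mask_prob:
                target = r if done else r + self.gamma * max(q[s2])
                q[s][a] += self.lr * (target - q[s][a])
```

Because each head is trained on a different resampling of the data, disagreement between heads persists exactly where the agent has little experience, which is what drives exploration toward those regions.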
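The recurrent modification can likewise be sketched. The hand-rolled recurrent Q-estimator below is an assumption for illustration only (the dissertation uses an LSTM inside a Deep Q-Network): a hidden state is folded forward over successive observations, so the Q-values at step t depend on the whole observation history rather than on the current noisy frame alone, which is the property that matters under partial observability.

```python
import math
import random

class RecurrentQ:
    """Minimal recurrent Q sketch: h_t = tanh(W_x o_t + W_h h_{t-1}),
    Q_t = W_q h_t. The hidden state integrates the observation history,
    which lets the estimator cope with noisy/incomplete observations."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = random.Random(seed)
        rand = lambda n, m: [[rng.uniform(-0.1, 0.1) for _ in range(m)]
                             for _ in range(n)]
        self.W_x = rand(hidden_dim, obs_dim)
        self.W_h = rand(hidden_dim, hidden_dim)
        self.W_q = rand(n_actions, hidden_dim)
        self.h = [0.0] * hidden_dim

    def reset(self):
        # Called at episode start: forget the previous history.
        self.h = [0.0] * len(self.h)

    @staticmethod
    def _matvec(W, v):
        return [sum(w * x for w, x in zip(row, v)) for row in W]

    def step(self, obs):
        # Fold the new observation into the hidden state...
        pre = [a + b for a, b in zip(self._matvec(self.W_x, obs),
                                     self._matvec(self.W_h, self.h))]
        self.h = [math.tanh(p) for p in pre]
        # ...and read the Q-values off the hidden state.
        return self._matvec(self.W_q, self.h)
```

Feeding the same observation twice yields different Q-values, because the hidden state has changed in between; that history dependence is precisely what a feed-forward DQN lacks in a POMDP.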
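The safety mechanism is described only at a high level in this abstract. A minimal sketch of the kind of guard it implies (all names, limits, and thresholds here are hypothetical) clamps each commanded joint angle into its legal range and aborts the motion when a force reading suggests contact, so training can continue without the arm colliding with itself or with external objects.

```python
def safe_command(target_angles, joint_limits, force_reading,
                 force_limit=10.0):
    """Clamp a commanded joint configuration into its limits and
    refuse to move if the force sensor reports a likely collision.

    target_angles : list of commanded joint angles (radians)
    joint_limits  : list of (lo, hi) bounds, one pair per joint
    force_reading : current end-effector force magnitude (N)
    """
    if force_reading > force_limit:
        return None  # abort: probable contact, let the training loop reset
    return [max(lo, min(hi, a))
            for a, (lo, hi) in zip(target_angles, joint_limits)]
```

A guard of this shape sits between the learned policy and the motor controller, so an exploratory action can never drive a joint outside its range or push through an obstacle.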
Keywords/Search Tags: Robot, Robotic Arm, Control Policy, Deep Reinforcement Learning, DQN, LSTM