
Research On The Manipulator Trajectory Planning Based On Q-learning

Posted on: 2014-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
Full Text: PDF
GTID: 2248330398995128
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
As an important branch of robotics research, the manipulator has received widespread attention, and applying machine learning to the control of manipulator trajectory planning has become a popular research direction in artificial intelligence. Q-learning is an unsupervised, model-free algorithm that also belongs to the family of online learning techniques. It learns and optimizes by interacting with the environment through trial and error.

This thesis studies the control problem of manipulator trajectory planning. An analysis of existing manipulator control methods and of the Q-learning algorithm shows that Q-learning has advantages when applied to manipulator trajectory planning. First, to obtain the coordinate transformations that describe the manipulator during motion, we analyze the structure of the actual system and abstract the manipulator as an articulated manipulator system working in two-dimensional space. To handle the coupling between adjacent mechanical links, we recast manipulator control as a multi-agent cooperative learning problem.

A single-agent path planning example is used to explain the theory of the Q-learning algorithm and to highlight its learning and optimization characteristics. For a multi-link manipulator system, we propose a specific solution to the state coupling between adjacent links and discuss its scalability.

To address the problem of ε-greedy converging to a locally optimal solution, we identify its cause by analyzing how the ε-greedy strategy searches for the optimal solution. We then improve the strategy so that its parameters are adjusted dynamically according to the learning process. With this dynamic greedy strategy, the Q-learning algorithm can escape from a locally optimal solution and continue until it reaches the global optimum.

To address the problem of evaluating the effect of a motion in the Q-learning algorithm, we design a method for judging how effective an executed action is in manipulator trajectory planning. The method is based on the Euclidean distance between the current position and the target point and gives a quantitative reward according to the effect of the motion. It not only overcomes the shortcoming of traditional evaluation mechanisms, which offer only "good" or "bad" indicators, but also ensures objectivity and fairness.

An intelligent controller for a 2-DOF manipulator is designed and used to compare and analyze the advantages of the improved Q-learning algorithm in trajectory planning. Finally, the controller is extended to 3-DOF manipulator trajectory planning. The simulation results show that the controller is extensible and feasible.
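
The ideas summarized in the abstract (tabular Q-learning, a greedy strategy whose exploration rate changes during learning, and a graded reward based on Euclidean distance to the target) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The link lengths, the angle discretization, the exponential decay schedule for ε, the reward defined as the reduction in distance to the target, and all function names are assumptions chosen for illustration. The arm is also treated as a single agent over the joint state, not as cooperating agents.

```python
import math
import random

L1, L2 = 1.0, 1.0                 # link lengths (assumed values)
N = 12                            # discrete angle steps per joint (assumed)
STEP = 2 * math.pi / N
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # move one joint by one step


def end_effector(state):
    """Forward kinematics of a planar 2-link arm; state = (i1, i2) angle indices."""
    t1, t2 = state[0] * STEP, state[1] * STEP
    x = L1 * math.cos(t1) + L2 * math.cos(t1 + t2)
    y = L1 * math.sin(t1) + L2 * math.sin(t1 + t2)
    return x, y


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def distance_reward(prev_state, next_state, target):
    """Graded reward: positive if the end effector moved closer to the target."""
    return dist(end_effector(prev_state), target) - dist(end_effector(next_state), target)


def epsilon_for(episode, eps_start=0.9, eps_min=0.05, decay=0.99):
    """Assumed schedule: explore heavily at first, less as learning proceeds."""
    return max(eps_min, eps_start * decay ** episode)


def step(state, action):
    """Apply an action; joint indices wrap around the full circle."""
    return ((state[0] + action[0]) % N, (state[1] + action[1]) % N)


def train(target, episodes=300, alpha=0.5, gamma=0.9, max_steps=60, tol=0.05, seed=0):
    """Tabular Q-learning; returns a dict mapping (state, action_index) to value."""
    rng = random.Random(seed)
    q = {}
    for ep in range(episodes):
        state = (0, 0)
        eps = epsilon_for(ep)
        for _ in range(max_steps):
            if rng.random() < eps:                      # explore
                a = rng.randrange(len(ACTIONS))
            else:                                       # exploit current estimate
                a = max(range(len(ACTIONS)), key=lambda k: q.get((state, k), 0.0))
            nxt = step(state, ACTIONS[a])
            r = distance_reward(state, nxt, target)
            best_next = max(q.get((nxt, k), 0.0) for k in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (r + gamma * best_next - old)
            state = nxt
            if dist(end_effector(state), target) < tol:
                break                                   # close enough to the target
    return q
```

Calling `train((-1.0, 1.0))` fills the Q-table for a target that this arm can reach. Because the reward is a signed distance change and not a binary flag, every action receives feedback proportional to its effect, which is the property the abstract attributes to the quantitative evaluation method.
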
Keywords/Search Tags: Manipulator controller, Q-learning algorithm, Trajectory planning, Greedy strategy, Local optimal solution, Quantitative evaluation