Research On The Manipulator Trajectory Planning Based On Q-learning

Posted on:2014-01-04

Degree:Master

Type:Thesis

Country:China

Candidate:H Zhao

Full Text:PDF

GTID:2248330398995128

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

Manipulators are an important branch of robotics research and have attracted wide attention, and applying machine learning methods to manipulator trajectory planning has become a popular research direction in artificial intelligence. Q-learning is a model-free algorithm that learns online without labeled training data: it interacts with the environment by trial and error in order to learn and optimize its behavior.

This thesis studies the control problem of manipulator trajectory planning. Based on an analysis of existing manipulator control methods and of the Q-learning algorithm, it argues that Q-learning is well suited to manipulator trajectory planning. First, to obtain the coordinate transformations needed during motion, we analyze the structure of the actual system and abstract the manipulator as an articulated manipulator working in two-dimensional space. Because adjacent links are coupled, we convert manipulator control into a multi-agent cooperative learning problem.

A single-agent path planning example is used to explain the theory of the Q-learning algorithm and to highlight its learning and optimization characteristics. For a multi-link manipulator system, we propose a specific solution to the state coupling between adjacent links and discuss its scalability.

To address the problem that the ε-greedy strategy can become trapped in a local optimum, we analyze how the ε-greedy strategy searches for the optimal solution and identify the cause. We then improve the strategy so that its parameters are adjusted dynamically according to the learning process. With this dynamic greedy strategy, the Q-learning algorithm can escape from local optima until it reaches the global optimum.

To address the problem of evaluating the effect of a motion in Q-learning, we design a method for judging the effectiveness of an executed action in manipulator trajectory planning. The method uses the Euclidean distance between the current position and the target point to assign a quantitative reward according to the effect of the motion. It overcomes the limitation of traditional evaluation mechanisms, which offer only "good" or "bad" indicators, and keeps the evaluation objective and fair.

An intelligent controller for a 2-DOF manipulator is designed and used to compare and analyze the advantages of the improved Q-learning algorithm in trajectory planning. Finally, the controller is extended to 3-DOF manipulator trajectory planning. The simulation results show that the controller is extensible and feasible.
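
The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes a planar 2-link arm with unit link lengths, joint angles discretized into 36 bins, a single tabular learner (not the multi-agent decomposition described above), a reward equal to the reduction in Euclidean distance to the target, and a linearly decaying ε as one simple stand-in for the "dynamic greedy strategy". All names and parameter values here are hypothetical.

```python
import math
import random

# Assumed setup (not from the thesis): unit link lengths, 10-degree angle bins.
L1, L2 = 1.0, 1.0
N = 36
STEP = 2 * math.pi / N
# Each action moves exactly one joint by one bin.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def end_effector(state):
    """Forward kinematics of a planar 2-link arm; state = (bin1, bin2)."""
    t1, t2 = state[0] * STEP, state[1] * STEP
    x = L1 * math.cos(t1) + L2 * math.cos(t1 + t2)
    y = L1 * math.sin(t1) + L2 * math.sin(t1 + t2)
    return x, y


def distance(state, target):
    """Euclidean distance from the end effector to the target point."""
    x, y = end_effector(state)
    return math.hypot(x - target[0], y - target[1])


def reward(prev, nxt, target):
    """Graded reward: positive if the move brought the tip closer."""
    return distance(prev, target) - distance(nxt, target)


def step(state, action):
    """Apply an action, wrapping joint angles around the full circle."""
    return ((state[0] + action[0]) % N, (state[1] + action[1]) % N)


def epsilon(episode, episodes, eps_max=0.9, eps_min=0.05):
    """Assumed schedule: explore heavily at first, then decay linearly."""
    frac = episode / max(1, episodes - 1)
    return eps_max - (eps_max - eps_min) * frac


def train(target, episodes=300, alpha=0.5, gamma=0.9,
          tol=0.2, max_steps=200, seed=0):
    """Tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = {}  # maps state -> list of Q-values, one per action
    for ep in range(episodes):
        eps = epsilon(ep, episodes)
        s = (0, 0)
        for _ in range(max_steps):
            if distance(s, target) < tol:
                break  # close enough to the target: end the episode
            qs = q.setdefault(s, [0.0] * len(ACTIONS))
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))                  # explore
            else:
                a = max(range(len(ACTIONS)), key=qs.__getitem__)  # exploit
            s2 = step(s, ACTIONS[a])
            r = reward(s, s2, target)
            q2 = q.setdefault(s2, [0.0] * len(ACTIONS))
            # Standard Q-learning update toward r + gamma * max_a' Q(s', a').
            qs[a] += alpha * (r + gamma * max(q2) - qs[a])
            s = s2
    return q
```

Calling `train(end_effector((9, 0)))` learns a table for reaching the point directly above the base. Because the reward is the change in distance rather than a binary good/bad signal, every move gets graded feedback, which is the property the abstract attributes to its quantitative evaluation method.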

Keywords/Search Tags:

Manipulator controller, Q-learning algorithm, Trajectory planning, Greedy strategy, Local optimal solution, Quantitative evaluation

Related items

1	Researches On Motion Planning Of Mobile Manipulator
2	Study On Trajectory Planning And Trajectory Tracking Of Tool Changing Manipulator Of Shield Machine
3	Inverse Kinematics Solution And Trajectory Planning Of 6-DOF Manipulator
4	The Trajectory Planning And Control Of 3-dof Parallel Manipulator With Limbs Of Embedding Structures
5	Research On Manipulator Trajectory Planning Based On Spark Platform And Multi-Objective Artificial Bee Colony Algorithm
6	Research On Trajectory Planning Of Six-axis Joint Manipulator Based On Time Optimization
7	The Research On Manipulator Trajectory Planning
8	Six Degrees Of Freedom Manipulator Trajectory Planning Studies
9	Research On Trajectory Planning And Optimization Of 6-DOF Manipulator
10	Research On Trajectory Planning Of 6-DOF Manipulator Used For Stream Generator