
Simulation For Manipulator Trajectory Planning Based On Deep Reinforcement Learning

Posted on: 2021-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: J T Zheng
Full Text: PDF
GTID: 2428330623967884
Subject: Control Science and Engineering
Abstract/Summary:
Traditional robot control algorithms require developers to be familiar with the relevant kinematic characteristics; their programming complexity is high, yet their reusability is correspondingly low. As industry transforms toward automation and intelligence, and especially thanks to continuous innovation in deep learning research in recent years, robot control algorithms have undergone a wave of reshuffling. Algorithms based on deep learning and machine learning are gradually becoming a research hotspot in this field, and manipulator trajectory planning is a classic problem within it. This thesis studies manipulator trajectory planning; the main work is as follows.

Existing deep reinforcement learning algorithms not designed specifically for manipulators perform poorly when applied directly to manipulator trajectory planning. To address this, a method based on a discrete action space, called dynamic step and partitioned reward design, is proposed. The method adjusts the structure of the original network, optimizing in particular the action output and the reward function design. By dynamically adjusting the movement speed of the manipulator, a balance between accuracy and speed is achieved, and the sparse-reward problem faced in reinforcement learning is alleviated. In a cross-validation comparison experiment that limits the number of decisions per episode, the method outperforms the original network, raising the average success rate on the test set from 43% to 56%.

Because it is difficult to learn multi-step decision making from cumulative rewards alone in manipulator trajectory planning, a trajectory planning method based on augmented learning from demonstrations is proposed. Given a small amount of demonstration data, the algorithm automatically expands the demonstrations and stores them in a replay memory for the manipulator to imitate, which effectively reduces the difficulty of early training and yields better performance. Compared with a traditional supervised learning method trained on the same demonstrations, in a cross-validation comparison experiment that limits the number of decisions per episode, the average success rate increased from 74% to 81%. Compared with inverse reinforcement learning, the two achieve similar average success rates, but the proposed method is easier to train and requires no additional neural network.

Based on Gazebo and ROS, a manipulator simulation platform was built for a Kent6 V2 six-axis manipulator. It can be easily ported across platforms via Docker images, helping different researchers conduct manipulator-related algorithm experiments. The generalization ability of the aforementioned algorithm models was also explored: a Kalman filter and Gaussian white noise were used to improve the models' generalization ability. Experiments were conducted in various simulation scenarios, and the results show that the models trained by the algorithms proposed in this thesis exhibit a certain degree of generalization ability within a limited simulation environment.
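The abstract does not give the exact formulas for the dynamic step size or the partitioned reward, so the following is only a minimal sketch of the general idea: the per-decision step shrinks as the end effector approaches the target, and the reward is piecewise over distance zones to densify the otherwise sparse goal signal. All thresholds, zone boundaries, and reward magnitudes here are illustrative assumptions, not values from the thesis.

```python
def dynamic_step(distance, coarse=0.05, fine=0.01, threshold=0.1):
    """Scale the per-decision step with distance to the target:
    move fast when far away, slow down near the goal.
    (Assumed scheme; thresholds are illustrative.)"""
    return coarse if distance > threshold else fine

def partitioned_reward(distance, reached_tol=0.02, zones=(0.1, 0.3)):
    """Piecewise ('partitioned') reward over distance zones, a common
    way to densify a sparse goal-only reward (assumed magnitudes)."""
    if distance < reached_tol:
        return 10.0   # goal reached: large terminal reward
    if distance < zones[0]:
        return 0.5    # near zone: encourage fine approach
    if distance < zones[1]:
        return 0.1    # middle zone: mild encouragement
    return -0.1       # far zone: small per-step penalty

# Example: 0.25 m from the target -> coarse step, middle-zone reward
print(dynamic_step(0.25), partitioned_reward(0.25))
```

The balance between accuracy and speed mentioned in the abstract falls out of this structure: large steps cover distance quickly far from the goal, while small steps near the goal keep the final positioning error within tolerance.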
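The demonstration-augmentation idea can be sketched as a replay memory pre-seeded with demonstration transitions before agent experience is added. The thesis does not specify how demonstrations are expanded; the state-jitter expansion below is a simple stand-in for that step, and the class name, noise level, and copy count are hypothetical.

```python
import random
from collections import deque

class DemoReplayBuffer:
    """Replay memory seeded with (expanded) demonstration transitions;
    agent transitions are appended on top during training."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def seed_with_demos(self, demos, noise=0.01, copies=5):
        # Expand each demo transition by jittering its state slightly --
        # a simple stand-in for the thesis's automatic demo expansion.
        for (s, a, r, s2, done) in demos:
            self.buffer.append((s, a, r, s2, done))
            for _ in range(copies):
                s_jit = tuple(x + random.gauss(0.0, noise) for x in s)
                self.buffer.append((s_jit, a, r, s2, done))

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

# One demo transition expanded into 1 original + 5 jittered copies
buf = DemoReplayBuffer()
buf.seed_with_demos([((0.0, 0.0), 1, 0.0, (0.1, 0.0), False)])
print(len(buf.buffer))
```

Seeding the memory this way lets early training batches contain successful trajectories from the start, which matches the abstract's claim that the approach reduces the difficulty of initial training.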
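For the generalization experiments, a Kalman filter paired with injected Gaussian white noise is a standard combination: noise perturbs the observations during training, and the filter smooths them back toward the true signal. Below is a minimal scalar Kalman filter in that spirit; the noise variances `q` and `r` are assumed values, not parameters from the thesis.

```python
def kalman_1d(measurements, q=1e-4, r=0.04, x0=0.0, p0=1.0):
    """Scalar Kalman filter smoothing a noisy signal, e.g. a joint
    angle corrupted by Gaussian white noise. q = process-noise
    variance, r = measurement-noise variance (assumed values)."""
    x, p = x0, p0
    out = []
    for z in measurements:
        p += q                # predict: uncertainty grows
        k = p / (p + r)       # Kalman gain
        x += k * (z - x)      # update estimate toward measurement
        p *= (1.0 - k)        # uncertainty shrinks after update
        out.append(x)
    return out

# Feeding a constant 1.0 signal drives the estimate toward 1.0
estimates = kalman_1d([1.0] * 50)
print(round(estimates[-1], 3))
```

In a simulation loop, the filter would sit between the (noise-injected) sensor readings and the policy input, so the policy trains on observations whose noise level it may encounter at test time while still receiving a stable state estimate.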
Keywords/Search Tags:Manipulator trajectory planning, Deep reinforcement learning, Imitation learning, Simulation platform