
Research On Trajectory And Path Planning Method Of Mobile Manipulator Based On Reinforcement Learning

Posted on: 2022-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: X F Huang
Full Text: PDF
GTID: 2518306524994199
Subject: Master of Engineering
Abstract/Summary:
In the field of industrial robots, the quality of collaborative trajectory and path planning for a mobile manipulator directly affects how efficiently it can automatically pick items in a storage environment. In past academic research, trajectory planning and path planning for mobile manipulators were often treated as independent problems. This separation breaks the connection between the two and, to a certain extent, reduces the picking efficiency of the mobile manipulator. To meet the need for a collaborative trajectory and path planning model for a mobile manipulator in an unknown environment, this thesis designs and implements a deep reinforcement learning algorithm for model training that links the trajectory planning and path planning problems, so that the mobile manipulator picks items more efficiently in a storage environment. The main work of this thesis includes:

(1) An experimental simulation environment was built and its supporting functions were completed. Modeled on a storage environment, the simulation environment was built with the Unity3D engine, a mobile manipulator agent was created with the ML-Agents framework, and the state space, action space and reward function of the agent were designed and specified. The agent's parameter information is used to train the deep reinforcement learning algorithms in this environment.

(2) A benchmark algorithm was selected by comparing several basic deep reinforcement learning algorithms under the experimental conditions. After fine-tuning the internal parameters of the basic algorithms, a loader for the experimental simulation environment was built to supply the environment information of the mobile manipulator agent to each algorithm. These algorithms were then used to train the collaborative trajectory and path planning model, the value-network loss and the average reward of the resulting models were compared, and the TD3 algorithm, which performed better, was selected as the benchmark algorithm.

(3) The benchmark algorithm was improved and experiments were conducted. The double delay mechanism of the TD3 algorithm stabilizes the model parameter updates, and the asynchronous advantage mechanism of the A3C algorithm further reduces the correlation of the experience data and speeds up model training. Therefore, a deep deterministic policy gradient algorithm that combines asynchronous advantage and double delay, namely the AA-TD3 algorithm, is proposed. Its overall architecture and neural network structure were designed and implemented, and the experimental results of the model before and after the improvement were compared.

This thesis applies deep reinforcement learning to the training of a collaborative trajectory and path planning model for a mobile manipulator for the first time, and then optimizes the internal structure of the benchmark algorithm. It includes both theoretical and application innovations, provides a practical foundation for the application and development of reinforcement learning in the field of industrial robots, and offers reference value for subsequent related research.
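
The abstract names TD3 as the benchmark algorithm but gives no implementation details. As a rough illustration only, the following Python sketch shows the three standard ingredients usually associated with TD3: a clipped double-Q target, target policy smoothing, and delayed actor updates. It is not the thesis's code and does not cover the asynchronous part of AA-TD3. All function names, and the default values for `gamma`, `sigma`, `noise_clip` and `policy_delay`, are assumptions chosen for illustration.

```python
import numpy as np


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: y = r + gamma * (1 - done) * min(Q1', Q2').

    Taking the minimum of the two target critics limits value overestimation.
    """
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)


def smoothed_target_action(target_action, rng, sigma=0.2, noise_clip=0.5,
                           max_action=1.0):
    """Target policy smoothing: add clipped Gaussian noise to the target action."""
    noise = rng.normal(0.0, sigma, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(target_action + noise, -max_action, max_action)


def should_update_actor(step, policy_delay=2):
    """Delayed updates: the actor is updated once every `policy_delay` critic updates."""
    return step % policy_delay == 0


# Hypothetical batch of two transitions (the second one is terminal).
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q1_next = np.array([2.0, 5.0])
q2_next = np.array([3.0, 4.0])

targets = td3_target(reward, done, q1_next, q2_next)

rng = np.random.default_rng(0)
noisy_action = smoothed_target_action(np.array([0.9, -0.95]), rng)
```

In this sketch the first transition's target uses the smaller of the two critic estimates (2.0), while the terminal transition's target reduces to its reward. In a full training loop, the critics would be regressed toward `targets` at every step, and the actor and target networks would be updated only when `should_update_actor` returns true.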
Keywords/Search Tags: Mobile Manipulator, Deep Reinforcement Learning, ML-Agents, Trajectory and Path Collaborative Planning, TD3, AA-TD3