
Research On Path Planning Of Warehouse Handling Robot Based On Deep Reinforcement Learning

Posted on: 2022-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: C T Rong
Full Text: PDF
GTID: 2481306329952779
Subject: Control Science and Engineering
Abstract/Summary:
With the continuous expansion of oilfield development and production, the task of ensuring the supply of materials has grown, and research on oilfield storage and handling robots has attracted increasing attention. The task of the oilfield storage and handling robot is to replace workers in issuing and storing downhole tools in the storage room. Because the storage-room environment changes continuously as tools are issued and stored, the robot must avoid both dynamic and static obstacles, and is therefore required to perform path planning effectively in an unknown dynamic environment. Deep Reinforcement Learning (DRL) algorithms select optimal actions through interaction with the environment without relying on any prior knowledge. In this paper, a DRL algorithm is combined with the path planning task of the oilfield storage and handling robot, and its effectiveness is verified in the Gazebo simulation environment. The main research contents are as follows:

First, to address the tendency of the Deep Deterministic Policy Gradient (DDPG) algorithm to fall into local optima, this paper introduces a pair of critic networks and selects the minimum of the Q-values produced by the two critics as the target value for updating the actor network parameters. This reduces the overestimation bias generated during network training and helps the algorithm converge to a better policy.

Second, to address the slow convergence of DDPG, this paper improves the sample priority in the experience replay mechanism: the sum of the two TD-errors produced by the two critic networks and the immediate reward of the sample is used as the sample's priority. Both the immediate reward and the absolute value of the TD-error are taken to be linearly related to sample importance, so this priority accounts for the influence of both factors during sampling and thereby accelerates convergence. The proposed method is evaluated in multiple experimental environments against multiple comparison algorithms on the OpenAI Gym platform to verify the effectiveness of the improved algorithm.

Finally, the improved method is applied to the path planning task of the oilfield storage and handling robot to verify its effectiveness. First, the path planning task is described and a solution is proposed, including the definition of the state space and action space. Second, the robot model is established node by node, the path planning environment model is built with dynamic and static obstacles, and the network model and reward function of the algorithm are designed. Third, the improved algorithm is combined with the path planning task in simulation experiments, and the results of the comparison experiments are analyzed to verify the effectiveness of the improved algorithm for path planning in an unknown dynamic environment.
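The double-critic target described in the first contribution can be sketched as follows. This is a minimal illustration of taking the minimum of two critics' Q-estimates when forming the TD target (as in clipped double-Q learning); the function name and the simplified scalar interface are assumptions for illustration, not the thesis's exact implementation.

```python
import numpy as np

def td_target(reward, done, next_q1, next_q2, gamma=0.99):
    """TD target using the minimum of the two critics' Q-estimates for the
    next state-action pair, which suppresses overestimation bias.
    `done` is 1.0 for terminal transitions (no bootstrapping), else 0.0."""
    min_q = np.minimum(next_q1, next_q2)
    return reward + gamma * (1.0 - done) * min_q
```

Both critics are then regressed toward this single shared target, while the actor is updated against one (or the minimum) of the critics.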
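The modified replay priority from the second contribution, |TD-error 1| + |TD-error 2| + immediate reward, can be sketched as below, together with proportional sampling. The floor value `eps`, the exponent `alpha`, and the clamping of negative-reward transitions are assumptions added here to keep every priority positive; the abstract does not specify how the thesis handles these details.

```python
import numpy as np

def sample_priority(td_error1, td_error2, reward, eps=1e-3):
    """Priority = |TD-error of critic 1| + |TD-error of critic 2| + immediate
    reward. The sum is floored at a small eps so that transitions with large
    negative rewards remain sampleable (an assumption, not from the thesis)."""
    return max(abs(td_error1) + abs(td_error2) + reward, eps)

def sample_indices(priorities, batch_size, alpha=0.6, rng=None):
    """Draw transition indices with probability proportional to priority^alpha."""
    if rng is None:
        rng = np.random.default_rng(0)
    p = np.asarray(priorities, dtype=float) ** alpha
    return rng.choice(len(priorities), size=batch_size, p=p / p.sum())
```

Because the priority grows with both TD-error and reward, transitions that are either surprising to the critics or directly rewarding are replayed more often, which is the mechanism the thesis credits for faster convergence.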
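The abstract states that a reward function is designed for the robot's path planning task but does not give its form. A common shape for such tasks, shown purely as an illustrative assumption (all constants and the shaping term are hypothetical, not the thesis's values), combines terminal goal/collision rewards with dense progress shaping:

```python
def step_reward(dist_prev, dist_now, collided, reached,
                goal_reward=100.0, collision_penalty=-50.0, shaping=10.0):
    """Illustrative per-step reward for mobile-robot path planning:
    a large terminal bonus on reaching the goal, a penalty on collision,
    and otherwise dense shaping proportional to progress toward the goal."""
    if reached:
        return goal_reward
    if collided:
        return collision_penalty
    # Positive when the robot moved closer to the goal this step.
    return shaping * (dist_prev - dist_now)
```

The dense shaping term matters in the storage-room setting: with only sparse terminal rewards, a DDPG-style agent navigating among moving obstacles would rarely observe a nonzero signal early in training.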
Keywords/Search Tags:Deep Reinforcement Learning, Gazebo, Mobile Robot, Path Planning, ROS