
Research on Robot Autonomous Path Planning in Indoor Dynamic Scenarios

Posted on: 2022-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: T Gan
GTID: 2518306524989709
Subject: Master of Engineering

Abstract/Summary:
An autonomous decoration robot must move back and forth through the decoration environment while performing its various tasks. The ability to plan its own path is an important measure of a mobile robot's intelligence: the robot must actively perceive environmental information and then use that information to make navigation decisions. In a dynamic environment the robot's knowledge of the global environment is incomplete, so it relies on its sensors to perceive local information and plans its path accordingly. The ability to understand and exploit environmental information is therefore the key to autonomous path planning for mobile robots. The main work of this thesis is as follows:

1. ROS and Gazebo were used to build a mobile robot simulation model and a simulation experiment environment. The robot obtains local environmental information and its own state through an onboard lidar sensor and a wheel odometer. A fully connected neural network processes the lidar readings and the robot state, and the proximal policy optimization (PPO) deep reinforcement learning algorithm, which combines a value function with a policy function, is used to train the robot to learn path planning.

2. To apply deep reinforcement learning more effectively to the mobile robot path planning task, the thesis designs a potential-energy-function reward shaping scheme. The shaping reward injects prior knowledge about the navigation task without changing the optimal policy, thereby accelerating learning. In addition, to address the delayed-reward credit assignment problem that arises over long decision sequences, an episode-based credit assignment model is designed: the reward obtained at the end of each episode is redistributed to the earlier actions in the trajectory. This reduces the variance of the total return, makes training smoother, improves sample efficiency, and speeds up learning.

3. To improve the robot's ability to adapt to a dynamic environment, the reward in the dynamic environment is modeled as a distribution, and an auxiliary task model is designed to predict that distribution. The auxiliary model uses two neural networks to output the mean and variance of the reward distribution and is trained by regressing against the true reward, minimizing an adaptive regression loss function. Hidden-layer features are shared between the deep reinforcement learning model and the auxiliary task model through hard parameter sharing. With this feature sharing, the auxiliary task helps the robot understand the dynamic environment better while it learns the path planning task, improving its ability to adapt to dynamic environments.
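The abstract does not give the exact form of the potential-energy reward described in item 2. The sketch below shows the standard potential-based shaping rule it refers to, with a hypothetical negative distance-to-goal potential standing in for the thesis's navigation prior; GAMMA, potential, and shaped_reward are illustrative names, not the thesis's code.

```python
import numpy as np

GAMMA = 0.99  # discount factor, assumed to match the PPO training setup


def potential(robot_xy, goal_xy):
    """Potential of a state: here the negative Euclidean distance to the goal.

    The exact potential used in the thesis is not stated in the abstract;
    negative distance-to-goal is a common choice that encodes the prior
    'closer to the goal is better'.
    """
    return -np.linalg.norm(np.asarray(goal_xy) - np.asarray(robot_xy))


def shaped_reward(env_reward, robot_xy, next_robot_xy, goal_xy):
    """Potential-based shaping: F(s, s') = gamma * phi(s') - phi(s).

    Adding F to the environment reward injects navigation prior knowledge
    while leaving the optimal policy unchanged (Ng et al., 1999).
    """
    shaping = GAMMA * potential(next_robot_xy, goal_xy) - potential(robot_xy, goal_xy)
    return env_reward + shaping
```

A typical use during rollout collection would be `r = shaped_reward(r_env, pose_before, pose_after, goal)`, replacing the raw environment reward fed to PPO.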
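Item 2 also describes redistributing each episode's end-of-episode reward over the earlier actions in the trajectory. The thesis's exact allocation rule is not given in the abstract; the sketch below uses a uniform split purely to illustrate the idea that the lump-sum terminal reward is spread across steps while the total return is preserved.

```python
from typing import List


def redistribute_terminal_reward(step_rewards: List[float]) -> List[float]:
    """Spread the terminal (end-of-episode) reward over the whole trajectory.

    Uniform redistribution is an illustrative assumption, not the thesis's
    exact scheme: the last entry is treated as the terminal reward, removed
    as a lump sum, and handed back to every step in equal shares, so the
    episode return is unchanged while per-step reward variance drops.
    """
    if not step_rewards:
        return []
    terminal = step_rewards[-1]
    share = terminal / len(step_rewards)
    redistributed = [r + share for r in step_rewards[:-1]]
    redistributed.append(share)  # the final step keeps only its share
    return redistributed
```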
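Item 3 combines the PPO heads and an auxiliary reward-distribution predictor on top of shared hidden features (hard parameter sharing). The PyTorch sketch below is one plausible way to wire this up; the layer sizes, the Gaussian negative log-likelihood standing in for the "adaptive regression loss", and all class and function names are assumptions, not taken from the thesis.

```python
import torch
import torch.nn as nn

LIDAR_DIM, STATE_DIM, ACTION_DIM = 360, 4, 2  # assumed sizes, not from the thesis


class SharedEncoder(nn.Module):
    """Fully connected encoder over the concatenated lidar scan and robot state.

    Its hidden features are hard-shared between the PPO heads and the
    auxiliary reward-distribution heads.
    """

    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LIDAR_DIM + STATE_DIM, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )

    def forward(self, lidar, state):
        return self.net(torch.cat([lidar, state], dim=-1))


class PPOWithAuxiliary(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.encoder = SharedEncoder(hidden)
        self.policy_head = nn.Linear(hidden, ACTION_DIM)  # action mean (log-std omitted)
        self.value_head = nn.Linear(hidden, 1)            # state value
        self.reward_mean = nn.Linear(hidden, 1)           # auxiliary: reward mean
        self.reward_logvar = nn.Linear(hidden, 1)         # auxiliary: reward log-variance

    def forward(self, lidar, state):
        h = self.encoder(lidar, state)
        return (self.policy_head(h), self.value_head(h),
                self.reward_mean(h), self.reward_logvar(h))


def auxiliary_regression_loss(pred_mean, pred_logvar, true_reward):
    """Gaussian negative log-likelihood: one plausible 'adaptive' regression
    loss, since the predicted variance re-weights each sample's error."""
    inv_var = torch.exp(-pred_logvar)
    return (0.5 * inv_var * (true_reward - pred_mean) ** 2 + 0.5 * pred_logvar).mean()
```

In training, this auxiliary loss would be added to the usual PPO policy and value losses so that gradients from the reward-prediction task also shape the shared encoder.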
Keywords/Search Tags: path planning, deep reinforcement learning, reward shaping, credit assignment, auxiliary tasks