Research On Manipulation Behavior Of Interactive Robot Based On Deep Reinforcement Learning

Posted on: 2022-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: J P Chen
GTID: 2518306563479484
Subject: Mechanical and electrical engineering
Abstract/Summary:
Manipulation is one of the main ways a robot interacts with the world. It has gradually become a research hotspot in robotics and is of great significance to the development of the field. Grasping is the primary way a robot manipulates objects. However, in a complex environment with tightly arranged objects, it is difficult for the robot to complete a manipulation task efficiently through grasping alone, and pushing is also needed. Pushing can create space for grasping, but synergistic strategies that combine grasping and pushing suffer from overestimated behavior values, low sampling efficiency, and the lack of an effective behavior evaluation system. To make the synergistic strategy of grasping and pushing generalize better to complex environments with tightly arranged objects, this thesis proposes a manipulation synergistic strategy based on deep reinforcement learning. The strategy lets the robot adopt different behaviors in different environments, suppresses the overestimation of behavior values, and improves both sampling efficiency and generalization to new environments. The main contents and innovations of this thesis are as follows:

(1) The Double DQN algorithm is used to decouple the choice of manipulation behavior from the calculation of its value. The synergistic strategy first uses the estimating network to select the index of the behavior with the maximum value in the next state, and then takes the value that the target network assigns to that index. The value output by the target network for that index is not necessarily the target network's own maximum behavior value. This avoids overestimation to a certain extent, suppresses model overfitting, and improves the stability and generalization of the model.

(2) A manipulation behavior evaluation system is proposed to measure the quality of manipulation behaviors. The reward function evaluates each behavior according to the criteria of this system, and the result is fed back to the synergistic strategy network for backpropagation. A grasp is judged successful if the gripper is not fully closed at the end of the grasp. For a push, the positions and number of the objects are identified through target detection, and an object separation degree is defined from the Euclidean distances between the objects. The object separation degree measures the change in the environment before and after the push and is used to evaluate the quality of the push. The evaluation system lets the reward function feed the quality of the robot's behavior back into the iterative training of the synergistic strategy more accurately, making the strategy more accurate and robust.

(3) A heuristic guidance strategy is proposed to provide prior knowledge for the robot. For grasping, the effective exploration space is the area where an object is located; for pushing, it is the area outside the object but within the pushing distance. Based on target detection, the heuristic guidance strategy calculates the effective areas for grasping and pushing separately, which provides prior knowledge and reduces the invalid exploration space.

(4) A manipulation behavior selection assistance mechanism is proposed to address cases where the robot's behavior does not match the state of the environment. When grasps fail persistently, the grasping direction is changed based on grasping experience; when the robot persistently pushes a single object, the behavior manner is changed based on pushing experience. The mechanism compensates for errors in the behavior values calculated by the synergistic strategy, helps the robot plan its behaviors reasonably, and improves the robot's grasping success rate and behavior efficiency.
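The decoupling described in contribution (1) can be illustrated with a small sketch of the Double DQN target. This is a minimal NumPy example and not the thesis's implementation: the function names `double_dqn_target` and `dqn_target`, the discount factor `gamma`, and the example values are assumptions made for illustration, and the behavior values are treated as a flat vector over candidate behaviors.

```python
import numpy as np


def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.9, done=False):
    """Double DQN target for one transition (hypothetical helper).

    q_online_next: behavior values of the next state from the estimating network.
    q_target_next: behavior values of the next state from the target network.
    """
    if done:
        return float(reward)
    # Step 1: the estimating network chooses the behavior index.
    best_index = int(np.argmax(q_online_next))
    # Step 2: the target network supplies the value for that index.
    return float(reward + gamma * q_target_next[best_index])


def dqn_target(reward, q_target_next, gamma=0.9, done=False):
    """Standard DQN target, shown only for comparison (hypothetical helper)."""
    if done:
        return float(reward)
    # The target network both chooses and evaluates the behavior.
    return float(reward + gamma * np.max(q_target_next))


# Made-up values for three candidate behaviors in the next state.
q_online_next = np.array([0.2, 0.9, 0.5])
q_target_next = np.array([0.3, 0.4, 1.1])

# The estimating network picks index 1, so the target network's value 0.4 is used.
print(double_dqn_target(1.0, q_online_next, q_target_next))  # about 1.36
# Standard DQN uses the target network's own maximum, 1.1.
print(dqn_target(1.0, q_target_next))  # about 1.99
```

In this example the two networks disagree about the best behavior, so the Double DQN target is lower than the standard DQN target. This is the sense in which evaluating the selected index with a second network dampens overestimation.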
Keywords/Search Tags:Deep reinforcement learning, Robot manipulation, Grasping and pushing, Synergistic strategy, Target detection