| Robotic grasping is an important component of advanced robotic operation systems. A complete robot grasping system involves many technologies, such as mechanical design, image processing, and path planning. In industrial scenarios, the robot grasping system assumes by default that all items in a scene are graspable. In indoor grasping systems, however, grasp performance is limited by the size and shape of the object, the structure of the gripper, and other factors. For example, the common two-finger electric gripper cannot directly grasp flat objects such as packaging boxes, discs, and cards. A robot can nevertheless adjust the position and orientation of such an object by sliding and pushing it until a feasible grasping posture is reached. This strategy is called pre-grasping. Because the environment is uncertain, adjusting the pose of the object requires complex decision conditions, so it is necessary to find a grasping strategy suitable for general-purpose grippers. In addition, complex desktop grasping tasks pose difficulties related to the shape, number, and position of the objects. These factors make it impossible to describe the environment with mathematical models, which renders model-based grasping algorithms inadequate. To address these problems, this dissertation studies object pose estimation and pre-grasping strategies for flat objects. Starting from vision and grasping decisions, it proposes a reinforcement learning method that allows the robot to learn the pre-grasping strategy, and provides three novel algorithms that improve the agent’s sample efficiency and model convergence in sparse-reward tasks. (1) To detect the pose of desktop objects, this dissertation provides a robust four-degree-of-freedom object pose detection algorithm. Image feature points are used to detect the surface textures of objects and to create the corresponding image templates. Based on the planar constraint of desktop objects, a reliable affine estimation strategy is designed to calculate the homography matrix of the objects. Using the depth camera together with a re-projection error minimization algorithm, four-degree-of-freedom pose estimation of desktop-level objects is realized, which provides stable and reliable pose estimates for the subsequent grasping process. (2) To solve the problem that flat objects do not satisfy the grasping conditions, this dissertation uses the MuJoCo simulator to build a virtual environment and realizes the sliding grasping process through maximum-entropy reinforcement learning. For sparse-reward continuous control tasks, the inverse region learning and success trajectory inspire algorithms are proposed to solve the imbalance between positive and negative samples in the early stage of the agent's exploration of the environment. Ablation studies on benchmark tests demonstrate the effectiveness of both algorithms. (3) This dissertation proposes a reinforcement learning training strategy in a hybrid action space to solve the problem of grasping target objects from a desktop in a cluttered scene. The agent learns to weigh the impact of collisions between the target and other objects, and curriculum learning and state decoupling help the agent adapt to the high-dimensional state space. In simulation experiments, compared with existing hybrid action strategies, the algorithm achieves a higher grasping success rate with fewer training episodes, and the action policies can be transferred to real-world scenarios. Generalization experiments show that the algorithm remains effective in higher-dimensional state spaces and that the grasping strategy generalizes to desktop objects of different shapes. |
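
As an illustration of the four-degree-of-freedom pose mentioned in (1), the following minimal sketch shows how an in-plane rotation and translation could be read off a 2x3 planar transform and combined with a depth value. It is not the dissertation's algorithm: the function name `planar_pose_from_affine`, the interpretation of the four degrees of freedom as (x, y, z, yaw), and the assumption that the transform is a rotation with uniform scale are all hypothetical choices made for this example.

```python
import math


def planar_pose_from_affine(affine, depth_m):
    """Sketch: read an assumed (x, y, z, yaw) pose off a 2x3 planar transform.

    `affine` is assumed to have the form [[s*cos, -s*sin, tx],
                                          [s*sin,  s*cos, ty]],
    i.e. rotation plus uniform scale, which is an assumption of this example.
    `depth_m` stands in for a depth-camera reading at the object centre.
    """
    (a, _b, tx), (c, _d, ty) = affine
    # The first column is (s*cos(yaw), s*sin(yaw)), so atan2 recovers yaw
    # and the column norm recovers the scale s.
    yaw = math.atan2(c, a)
    scale = math.hypot(a, c)
    # x and y stay in the units of the transform (image pixels here);
    # converting them to metres would require camera calibration.
    return {"x": tx, "y": ty, "z": depth_m, "yaw": yaw, "scale": scale}


# Usage: a template rotated by 30 degrees and scaled by 2.
s, angle = 2.0, math.radians(30)
A = [[s * math.cos(angle), -s * math.sin(angle), 0.1],
     [s * math.sin(angle),  s * math.cos(angle), -0.2]]
pose = planar_pose_from_affine(A, depth_m=0.5)
```

In practice the planar transform would come from matched feature points between the template and the camera image, and the initial estimate would then be refined, for example by minimizing re-projection error as the abstract describes.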