| 6D object pose estimation (the 3D translation and 3D rotation of the object coordinate system with respect to the camera coordinate system, i.e., the position and orientation of the object) is widely used in computer vision and robotics, and is a prerequisite for industrial and service robots to perform intelligent tasks such as grasping and manipulating objects. To meet the real-time requirements of practical application scenarios, 6D object pose estimation based on single-view data (a single RGB image or a single RGB-D image) has important research value. Existing single RGB image-based methods robustly handle background clutter, weak texture, and occlusion in instance-level tasks (which estimate the 6D pose only of objects with known CAD models), but their pose estimation accuracy for small objects in images is low. Existing single RGB-D image-based methods meet the needs of instance-level tasks, but are less robust to intra-category shape variation in category-level tasks (which estimate the 6D pose of unknown objects that belong to the same category as known CAD models). This thesis addresses both problems: the low accuracy of single RGB image-based instance-level methods on small objects in images, and the weak robustness of single RGB-D image-based category-level methods to object shape changes within a category. The main studies are as follows: (1) The research status of single-view 6D object pose estimation is reviewed, with a focus on deep learning-based methods. Existing methods are categorized as instance-level or category-level according to whether they can estimate the 6D pose of unknown objects belonging to the same category as known CAD models, and the limitations and open problems of both types are analyzed. (2) To address the problem that the single RGB image-based instance-level
methods have low accuracy in estimating the poses of small objects in images, an adaptive scaling and multi-scale information fusion 6D pose estimation network for small objects is proposed. First, the factors affecting the pose error of two-stage methods are analyzed, and it is demonstrated that, for the same keypoint location error, the pose estimation error of an object is inversely related to its size in the image. Then, based on this analysis, an adaptive scaling strategy is designed: at the input, it scales the object in the image to the size required by the keypoint location network, and at the pose-solving stage, it scales the located keypoints back to the original image. Finally, a multi-scale information fusion module is constructed in the middle of the keypoint location network. Extensive experimental results verify that the proposed network outperforms existing related methods in pose estimation accuracy on the benchmark dataset, especially for small objects in images. (3) To address the problem that single RGB-D image-based category-level methods are not robust to intra-category object shape changes, an axial-plane and dense axial-vectors guided category-level 6D object pose estimation network is proposed. First, a dense axial-vectors guided module is constructed to improve the network's prediction accuracy for axial vectors. Then, an axial-plane geometric feature module is designed as a feature with strong intra-category similarity, improving the robustness of the network to the shape variation of intra-category objects. Finally, a loss function constrained by pose and point cloud consistency is introduced. Experimental results on the benchmark dataset verify that the proposed network outperforms existing related methods in pose estimation accuracy and is robust to intra-category object shape changes. |
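
The adaptive scaling strategy in (2) can be illustrated with a minimal sketch. It is not the thesis implementation: the `Box` helper, the function names, the 128-pixel target size, and the assumption that the keypoint network makes a roughly constant pixel error in its input crop are all hypothetical choices made for illustration. The sketch maps points into a crop scaled to a fixed size, maps located keypoints back to the original image, and compares the keypoint error relative to object size with and without scaling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Object bounding box in original-image pixels (hypothetical helper)."""
    x: float
    y: float
    w: float
    h: float


def scale_factor(box: Box, target: float) -> float:
    """Factor that maps the longer side of the box to `target` pixels."""
    return target / max(box.w, box.h)


def to_network(pt, box: Box, target: float):
    """Map an original-image point into the scaled crop fed to the network."""
    s = scale_factor(box, target)
    return ((pt[0] - box.x) * s, (pt[1] - box.y) * s)


def to_image(pt, box: Box, target: float):
    """Map a keypoint located in the scaled crop back to the original image."""
    s = scale_factor(box, target)
    return (pt[0] / s + box.x, pt[1] / s + box.y)


def relative_error(box: Box, err_px: float, target=None) -> float:
    """Keypoint error divided by object size, both in original-image pixels.

    With `target=None` the error is made directly in the original image.
    Otherwise it is made in the scaled crop and mapped back to the image.
    """
    size = max(box.w, box.h)
    if target is None:
        return err_px / size
    return (err_px / scale_factor(box, target)) / size


# Assumed example values: a small and a large object, 2 px localisation error.
small = Box(x=100.0, y=50.0, w=32.0, h=24.0)
large = Box(x=0.0, y=0.0, w=256.0, h=200.0)

crop_pt = to_network((116.0, 62.0), small, target=128.0)
print(crop_pt, to_image(crop_pt, small, target=128.0))
print(relative_error(small, 2.0), relative_error(large, 2.0))
print(relative_error(small, 2.0, 128.0), relative_error(large, 2.0, 128.0))
```

Under these assumptions, the same 2-pixel error is eight times larger relative to the small object than to the large one when no scaling is applied, which mirrors the inverse relation between pose error and object size stated above. With scaling to a common input size, the relative error is the same for both objects.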