
Fusing Depth And Texture Information For 3D Object Recognition And 6D Pose Estimation

Posted on: 2021-01-26  Degree: Master  Type: Thesis
Country: China  Candidate: F Chen  Full Text: PDF
GTID: 2428330623465058  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
As artificial intelligence technology matures, robots are becoming increasingly intelligent. They are gradually being integrated into people's lives and work, taking over some of the repetitive mechanical tasks once done by humans. When interacting with the outside world, a robot needs to perceive its environment through a variety of devices, such as cameras and three-dimensional sensors. How to effectively use the information collected by these devices to recognize a target object and estimate its 6D pose, so as to help the robot complete grasping tasks, is of great importance for practical engineering needs. Therefore, this thesis takes "Fusing Depth and Texture Information for 3D Object Recognition and 6D Pose Estimation" as its research topic and conducts in-depth research on neural-network-based object recognition and 6D pose estimation. The main work comprises the following parts: 3D object recognition based on point cloud data, object recognition and segmentation in complex backgrounds, and 6D pose estimation combining texture and depth information.

1. Convolutional networks cannot be applied directly to point cloud data, because point clouds are unstructured. Many methods map the point cloud to an image using hand-crafted rules and then apply convolutional networks for feature extraction and classification. However, manually designed mapping rules often cause a loss of information. To address this shortcoming, we use a deconvolution operation to learn the mapping between point cloud and image autonomously. A mapping established in this data-driven way retains information useful for the subsequent classification task, improving the final classification accuracy.

2. To recognize and segment objects in real, complex scenes, we design a new semantic segmentation network that takes an image and a depth map as input. The network uses two independent backbone networks to separately extract features from the image and
depth map, and then concatenates the feature maps of corresponding scales. A Pyramid Pooling Module (PPM) is used in the network to extract global context information. Finally, a densely connected scheme aggregates multi-scale features to provide rich and effective information for pixel-level classification.

3. To obtain a lightweight and effective 6D pose estimation model, we perform model compression on DenseFusion. Based on MixConv, a lightweight and effective color-feature-extraction backbone is built. With this backbone, our algorithm achieves better results on two benchmarks, the Linemod and YCB-Video datasets. To compress the model further, the Filter Pruning via Geometric Median (FPGM) algorithm is used to prune the iterative refinement network in DenseFusion. After applying FPGM, our model keeps its original performance while its parameter count and computational cost are greatly reduced.

4. Based on the algorithms in this thesis, we design a robotic arm grasping system. A large number of grasping experiments verify the reliability and robustness of our algorithms.
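To make the pyramid pooling step concrete, the following is a minimal NumPy sketch of a PPM: the feature map is adaptively average-pooled at several scales, each pooled map is upsampled back to the input resolution, and all maps are concatenated with the original features. This is an illustrative sketch, not the thesis's actual network; the function names and the scale set (1, 2, 3, 6) are assumptions based on the standard PPM design.

```python
import numpy as np

def adaptive_avg_pool(x, s):
    """Average-pool a (C, H, W) feature map into an (C, s, s) grid."""
    C, H, W = x.shape
    out = np.zeros((C, s, s))
    for i in range(s):
        h0, h1 = i * H // s, -(-(i + 1) * H // s)      # bin bounds (ceil for end)
        for j in range(s):
            w0, w1 = j * W // s, -(-(j + 1) * W // s)
            out[:, i, j] = x[:, h0:h1, w0:w1].mean(axis=(1, 2))
    return out

def ppm(x, scales=(1, 2, 3, 6)):
    """Pyramid pooling: pool at each scale, upsample back, concatenate."""
    C, H, W = x.shape
    outs = [x]
    for s in scales:
        p = adaptive_avg_pool(x, s)
        # nearest-neighbour upsampling back to (H, W)
        yi = np.arange(H) * s // H
        xi = np.arange(W) * s // W
        outs.append(p[:, yi][:, :, xi])
    return np.concatenate(outs, axis=0)   # (C * (1 + len(scales)), H, W)
```

At scale 1 the pooled map is simply the per-channel global mean broadcast over the whole spatial grid, which is what gives the segmentation head its global context.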
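The FPGM criterion used for pruning can also be sketched briefly: rather than pruning small-norm filters, FPGM prunes the filters closest to the geometric median of all filters in a layer, since those are the most redundant. The sketch below approximates the geometric median by each filter's total distance to all other filters; it is an illustrative NumPy sketch of the selection rule only, not the thesis's pruning pipeline.

```python
import numpy as np

def fpgm_select(filters, prune_ratio):
    """Return indices of the filters to prune under the FPGM criterion.

    filters: array of shape (N, ...) -- N filters, each flattened to a vector.
    A filter whose summed distance to all other filters is small lies near
    the geometric median and is considered redundant.
    """
    F = filters.reshape(filters.shape[0], -1)
    # pairwise Euclidean distances, summed per filter
    dist_sum = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2).sum(axis=1)
    k = int(round(prune_ratio * len(F)))
    return np.argsort(dist_sum)[:k]   # the k filters nearest the median
```

Because the criterion measures redundancy rather than magnitude, a filter with a large norm can still be pruned if many similar filters exist, which is why the pruned refinement network can keep its original accuracy.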
Keywords/Search Tags:3D object recognition, 6D pose estimation, convolutional neural network, model compression