
Research On Single-view 3D Object Recognition And Pose Estimation Based On Deep Learning

Posted on: 2021-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: A Lv
Full Text: PDF
GTID: 2518306353950889
Subject: Robotics Science and Engineering
Abstract/Summary:
Cheaper depth sensors have made three-dimensional data increasingly accessible. They enable service robots, unmanned vehicles, and industrial robots to perceive their working environment in a higher dimension, and thereby improve their capability and efficiency. As a result, the depth camera has become one of the must-have components of intelligent machines. Once the three-dimensional data of an object has been acquired, recognizing the object and estimating its pose are prerequisites for further tasks. This thesis therefore studies 3D object recognition and pose estimation based on deep learning.

First, the raw data captured by the depth sensor is preprocessed at the image level and then at the 3D point cloud level. Time-domain filtering is applied to improve the stability of the depth data, and statistical filtering is used to remove edge noise around the object. Together, these methods guarantee the quality of the data source for the subsequent 3D-aware tasks.

Second, deep-learning-based 3D object recognition and pose estimation face two difficulties: collecting real training datasets is laborious, and a lack of data diversity easily causes over-fitting. To address these, this thesis designs a scheme for generating synthetic datasets from 3D models of objects. A fast 3D reconstruction system based on artificial circular markers is designed and implemented using an inexpensive depth camera. Random sample consensus (RANSAC) is then used to remove the supporting-plane point cloud and extract the target object, and features are computed from the relationships between key points. After feature matching, the spatial transformation between corresponding key points is calculated to register the target point clouds of different views. A synthetic dataset is finally generated by multi-view virtual scanning, and an experiment verifies the simplicity and reliability of the method.

Third, this thesis identifies three possible problems that arise when a neural network extracts point cloud features. A pooling operation is used to address the unordered nature of point clouds. Because rotation about the camera's z-axis degrades PointNet's recognition accuracy on local single-view point clouds, this thesis designs a direction-correction module that corrects the rotation of the point cloud before recognition and thus improves recognition accuracy. A spatial transformation network is then used to predict the pose of the object, and experiments show that an iterative refinement strategy improves the accuracy of the estimated pose.

Finally, because geometric features and an iterative strategy alone do not estimate the pose of the object well, this thesis proposes fusing texture features with geometric features to improve pose-estimation accuracy. The pyramid scene parsing network is used to extract image texture features; these are then merged with geometric features at different dimensions through a fusion network, using the pixel-wise mapping between the color image and the depth image. Experiments show that this method, based on per-point features and confidence scores, improves the accuracy of pose estimation. The thesis closes with a summary of the work and a plan for future research.
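Two of the preprocessing steps described above — statistical filtering of edge noise and RANSAC removal of the supporting plane — can be sketched as follows. This is a minimal NumPy illustration, not the thesis's actual implementation; the parameters (`k`, `std_ratio`, `dist_thresh`, iteration count) are hypothetical and would need tuning to the sensor.

```python
import numpy as np

def statistical_outlier_filter(points, k=8, std_ratio=1.0):
    """Remove points whose mean distance to their k nearest neighbours
    exceeds the global mean by std_ratio standard deviations
    (a simple edge-noise filter)."""
    # Full pairwise distances: fine for small clouds; use a KD-tree at scale.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    d.sort(axis=1)
    mean_knn = d[:, 1:k + 1].mean(axis=1)      # skip the zero self-distance
    thresh = mean_knn.mean() + std_ratio * mean_knn.std()
    return points[mean_knn <= thresh]

def ransac_plane(points, n_iter=200, dist_thresh=0.01, seed=0):
    """Fit the dominant plane by RANSAC and return a boolean inlier mask.
    Dropping the inliers leaves the object that sat on the plane."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(n_iter):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                        # degenerate (collinear) sample
            continue
        normal /= norm
        dist = np.abs((points - p0) @ normal)  # point-to-plane distances
        mask = dist < dist_thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask
```

On a synthetic scene (a flat table plus an object cluster above it), `ransac_plane` marks the table points as inliers, and removing them isolates the object for the later recognition stage.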
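The pooling trick mentioned above for the disorder problem can also be shown in a few lines. This is a NumPy sketch of the PointNet idea — a shared per-point layer followed by a symmetric max pooling — not the thesis's network; the layer width and random weights are purely illustrative.

```python
import numpy as np

def shared_mlp(points, weights, bias):
    """Apply the same layer to every point (shared weights), (N, 3) -> (N, F)."""
    return np.maximum(points @ weights + bias, 0.0)   # ReLU

def global_feature(points, weights, bias):
    """Per-point features followed by max pooling over the point axis.
    Max is a symmetric function, so the descriptor ignores point order."""
    return shared_mlp(points, weights, bias).max(axis=0)

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 16)), rng.normal(size=16)
cloud = rng.normal(size=(32, 3))
shuffled = cloud[rng.permutation(len(cloud))]

# Identical global descriptor regardless of how the points are ordered:
assert np.allclose(global_feature(cloud, W, b),
                   global_feature(shuffled, W, b))
```

Because the max is taken per feature channel across all points, any permutation of the input rows leaves the pooled vector unchanged, which is exactly why an unordered point set can feed a fixed-size classifier.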
Keywords/Search Tags: Object recognition, Pose estimation, 3D reconstruction, Point cloud, Feature fusion