Font Size: a A A

3D Object Recognition And Pose Estimation In Low-cost Vision

Posted on:2021-03-24Degree:MasterType:Thesis
Country:ChinaCandidate:Y R WangFull Text:PDF
GTID:2428330623965038Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Pose estimation makes use of the image information obtained by visual sensors to estimate the distance and posture between the target object and the visual sensor.It is one of the key components of practical applications including robot interaction with the environment and the virtual reality.The feature information extracted by the traditional computer vision method often cannot meet the needs of the application.The powerful feature abstraction and expression capabilities of deep learning in the field of computer vision provide a new solution for pose estimation.In addition,the development and application of sensors such as depth cameras and lasers also provide more diverse means to tackle the problem,but the application of these sensors depend on the material and shape of objects in structured environments.Multi-eye vision often has some drawbacks including installation difficulties and complicated debugging.In contrast,the visual sensor has a low price,fewer use restrictions,and is easy to be applied to robot platforms in different unstructured environments.Considering that the monocular pose estimation method based on the traditional method,it is difficult to overcome the problems such as complex background and occlusion in the unstructured environment.This paper uses the deep learning method to identify the target object to obtain the target object's scale and other characteristics,and then estimate more accurate pose information with gaining greater robustness and generalization ability.The main work of this work are:1)An end-to-end pose estimation model with multi-task is designed on the basis of deep neural networks.The model is divided into two parts: target detection and pose estimation.The loss function allows the model to directly estimate the pose of the object,avoids the use of the two-dimensional and three-dimensional mapping relationships commonly used in other methods,and simplifies the entire process.Based on the training method of the optical flow network,the synthetic data was used as a training set in the network training,and good results were obtained.Compared with the reference method,the method in this paper improves the accuracy of the pose prediction results by nearly 20%.2)The estimating accuracy from a single image is limited.A network for improving estimating accuracy is explored.Assuming a three-dimensional model of the object,the network is used to estimate the pose difference of the object between the real image and the rendered image.By using the pose difference in unit space,the adverse impact of the actual size of the object is eliminated.The introduced method improves the accuracy of pose estimation by 20%.In summary,this work uses deep learning methods to identify and estimate the objects,and conducts experiments on public data sets.The pose estimation of objects form a single image has achieved with good results.
Keywords/Search Tags:Pose estimation, Monocular vision, Deep learning
PDF Full Text Request
Related items