Object Detection And Pose Estimation Based On Neural Network

Posted on: 2019-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Zhang
GTID: 2428330572456310
Subject: Engineering
Abstract/Summary:
In recent years, computer technology has advanced worldwide, and computer vision techniques have developed rapidly and matured. Object detection has long been one of the most popular research directions in computer vision. Pose estimation refers to estimating the position and orientation of a target object relative to the camera, which is particularly important in 3D object recognition and detection. In this thesis, object detection and pose estimation are handled simultaneously using a convolutional neural network. The main research contents are as follows:

1. Traditional object detection algorithms are studied, and their advantages and disadvantages are analyzed. MVCNN requires a complete set of multi-view images recorded from all predefined views in order to classify an object. In real scenes, however, it is difficult to obtain images of an object from every predefined viewpoint. Therefore, we proposed a method called RotationNet, which can classify an object using only a partial set of multi-view images. The method treats the view labels associated with the images as latent variables and outputs the object's category by rotation, selecting the pose that maximizes the object category likelihood. This addresses the limitation of the MVCNN approach. The base network of RotationNet is AlexNet. On top of this network, two strategies are proposed for combining feature maps from layers at different depths. The first strategy stacks the feature maps along the depth direction. The second strategy fuses the elements at corresponding locations of feature maps with the same dimensions.

2. On the basis of Faster R-CNN, a pose estimation branch is added to the network structure in different forms, extending Faster R-CNN and yielding three network models for joint object detection and pose estimation: the single-path architecture, the double-path architecture and the double-network architecture. They are designed to progressively decouple the object localization and pose estimation tasks. Approaches to pose estimation fall into two groups: discrete approaches, which treat pose estimation as a classification problem, and continuous approaches, which treat viewpoint estimation as a regression problem. The typical loss functions of the two groups are studied separately.

3. The experiments focus on the traditional detection methods and on the two methods for object detection and pose estimation presented in this thesis. For the method based on latent variables, the classification and pose estimation accuracy of the basic AlexNet, the depth overlay network and the feature fusion network are tested on ModelNet10, ModelNet40 and RGB-D. For the method based on Faster R-CNN, the single-path, double-path and double-network architectures are tested on the Pascal 3D+ dataset, and the discrete and continuous loss functions for pose estimation are evaluated experimentally. Finally, the best-performing model, the double-network architecture, is compared with existing joint detection methods; the comparison shows that the proposed algorithm is promising.
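The two feature-combination strategies and the two families of pose loss described in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the abstract does not say which layers are combined, which element-wise operation is used, or which loss functions were chosen. The sketch assumes feature maps in (channels, height, width) layout, element-wise addition for the second strategy, cross-entropy over azimuth bins for the discrete loss, and a squared wrapped angular difference for the continuous loss. All function names are hypothetical.

```python
import numpy as np


def fuse_by_depth_concat(a, b):
    """Strategy 1 (sketch): stack two (C, H, W) maps along the channel axis."""
    if a.shape[1:] != b.shape[1:]:
        raise ValueError("spatial sizes must match for depth concatenation")
    return np.concatenate([a, b], axis=0)


def fuse_elementwise(a, b):
    """Strategy 2 (sketch): combine same-shape maps position by position.

    Addition is an assumption; the abstract does not name the operation.
    """
    if a.shape != b.shape:
        raise ValueError("shapes must match for element-wise fusion")
    return a + b


def angle_to_bin(angle_deg, num_bins):
    """Map an azimuth in degrees to a class index for the discrete approach."""
    return int((angle_deg % 360.0) // (360.0 / num_bins))


def discrete_pose_loss(logits, true_bin):
    """Discrete approach (assumed form): cross-entropy over azimuth bins."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[true_bin])


def continuous_pose_loss(pred_deg, true_deg):
    """Continuous approach (assumed form): squared angular error in radians.

    The difference is wrapped to (-pi, pi] so 359 and 1 degrees count as close.
    """
    diff = np.deg2rad(pred_deg - true_deg)
    wrapped = np.arctan2(np.sin(diff), np.cos(diff))
    return float(wrapped ** 2)


# Illustrative usage with random feature maps (shapes are arbitrary).
rng = np.random.default_rng(0)
shallow = rng.standard_normal((3, 4, 4))
deep = rng.standard_normal((3, 4, 4))
print(fuse_by_depth_concat(shallow, deep).shape)  # (6, 4, 4)
print(fuse_elementwise(shallow, deep).shape)      # (3, 4, 4)
print(discrete_pose_loss(rng.standard_normal(24), angle_to_bin(350.0, 24)))
print(continuous_pose_loss(359.0, 1.0))
```

Concatenation preserves both feature maps at the cost of more channels. Element-wise fusion keeps the channel count fixed but requires identical shapes. The wrapped difference in the continuous loss avoids penalizing predictions that differ from the target only by crossing the 0/360 degree boundary.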
Keywords/Search Tags:Object detection, Pose estimation, Latent variables, Faster R-CNN, Convolutional neural network, ModelNet, RGB-D, Pascal 3D+