Font Size: a A A

3D Object Recognition Algorithm Based On Deep Learning

Posted on:2018-11-05Degree:MasterType:Thesis
Country:ChinaCandidate:R S LiFull Text:PDF
GTID:2348330536462018Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
With the development of 3D sensing technologies like Kinect,RealSense et at,recording high quality RGB and depth images becomes more convenient,and 3D object recognition based on RGB-D image has been widely concerned.Some researches indicate that combining RGB and depth image can obviously improve the performance of 3D object recognition.Feature extraction is the key point of object recognition,and deep learning algorithm has outstanding performance on autonomous learning and feature extracting,it has been an important research direction in the field of computer vision.For the 3D object recognition task,combining RGB and Depth image information,RGB-D object recognition algorithms based on supervised deep learning and unsupervised deep learning are presented.Firstly,an improved multi-modal deep convolution neural network(DCNN)algorithm based on supervised learning is proposed.In this improved algorithm,depth image is colorized by surface normal which is estimated by PCA,and we realize the model training and feature fusion by using CaffeNet model[50] as pre-training parameters and fine-tuning technology.Secondly,a feature extracting algorithm CNN-SPPL-RNN based on unsupervised learning is proposed,to solve the problem that CNN-RNN model[41] is not adaptive to the input image with different size,spatial pyramid pooling(SPPL)is introduced and used to extract translationally invariant features from CNN feature maps by using different spatial pyramid partitions,then the results of the SPPL are given as inputs to the random RNN networks to compose higher order features.Finally,based on the features extracted by unsupervised deep learning algorithm CNN-SPPL-RNN,a pose estimation algorithm with tree structure is presented.Using Softmax classifier to validate the performance of the proposed algorithm on RGB-D dataset,the results indicate that the category recognition accuracy of the improved multi-modal DCNN algorithm rises a little,and the instance recognition accuracy of that reaches amazing 96.9%,which higher 4.1% than HMP[38].Unsupervised CNN-SPPL-RNN algorithm achieves state of the art performance on category recognition with low dimension features.Besides,the proposed pose estimation algorithm also can locate the pose effectively.
Keywords/Search Tags:3D object recognition, Deep learning, Convolutional Neural Network, Spatial pyramid pooling, Recurrent Neural Network, Pose estimation, Softmax classifier
PDF Full Text Request
Related items