Research on 3D Object Recognition Using Volumetric CNNs

Posted on: 2022-08-30 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: A.A.M.MUZAHID | Full Text: PDF
GTID: 1488306722457694 | Subject: Communication and Information System
Abstract/Summary:
Advances in 3D modelling technologies and low-cost sensors have made it easier to acquire 3D models in real time. In computer vision, 3D object recognition is one of the most important tasks for many real-world applications, including robot navigation and unmanned driving. Following the great success of neural networks in understanding and analyzing digital graphical data, Convolutional Neural Networks (CNNs) have become the most popular, state-of-the-art Deep Learning (DL) models for computer vision tasks such as image recognition. The recognition task is to assign the correct labels (categories) to objects by using a classification algorithm in the CNN model. A CNN can directly take the pixel values of an input image as features. However, constructing intricate 3D features is crucial for the advancement of 3D object classification: 3D objects must first be transformed into CNN-readable representations of 3D data (i.e., voxels, point clouds and multiview-based 2D images). The volumetric voxel representation provides rich information on the surface and geometrical features of 3D objects. Existing volumetric voxel-based CNN approaches have achieved remarkable progress in 3D object classification, but their classification accuracy is not as good as that of 2D image classification. Volumetric CNNs require large training datasets and generate a huge computational overhead that limits the extraction of global features at higher resolutions of 3D objects. Therefore, it is of great significance to investigate efficient volumetric CNNs to improve 3D classification accuracy. This thesis focuses on improving volumetric representation methods and developing advanced volumetric models for the 3D object classification task, and it proposes three volumetric models: MS-VDCNN, CurveNet and PC-GAN.

First, we investigate the computational challenges, since the high cost of applying 3D CNNs to volumetric data forces a reduction of the volume resolution. We introduce a low-cost Multiscale Volumetric Deep Convolutional Neural Network (MS-VDCNN) for 3D object classification based on joint multiscale hierarchical and subvolume supervised learning strategies. MS-VDCNN takes as input 3D data preprocessed into a memory-efficient octree representation; we limit the full-layer octree depth to a certain level, based on the predefined input volume resolution, in order to store high-precision contour features. Multiscale features from multilevel octree depths are concatenated inside the network, aiming to adaptively generate high-level global features. The subvolume supervision strategy trains the network on subparts of the 3D object in order to learn local features. The performance of the MS-VDCNN framework has been evaluated on two publicly available 3D repositories. Experimental results demonstrate the effectiveness of MS-VDCNN: the classification accuracy is improved in comparison to existing volumetric approaches, and the memory consumption ratio and run time are reduced significantly.

Second, we investigate the principal curvature directions of a 3D mesh (using a CAD model) to represent the geometric features used as inputs for the 3D CNN. We introduce the CurveNet model, which learns perceptually relevant salient features and predicts object class labels. Curvature directions incorporate complex surface information of a 3D object, which helps CurveNet to produce more precise and discriminative features for object recognition. The multitask learning is motivated by sharing features between two related tasks: pose classification is treated as an auxiliary task that enables CurveNet to generalize better on object label classification. Experimental results show that CurveNet performs better with curvature vectors than with voxels as the input for 3D object classification. We further improve the performance of CurveNet by combining two networks that take both the curvature directions and the voxels of a 3D object as inputs. A cross-stitch module is used to learn effective shared features across the multiple representations. The classification accuracy of CurveNet is evaluated on three publicly available datasets, and it achieves competitive performance in the 3D object recognition task.

Third, we investigate the volumetric model from the perspective of the lack of labeled data. We introduce a novel Progressive Conditional Generative Adversarial Network (PC-GAN) for 3D object recognition by conditioning the input with progressive learning strategies. PC-GAN is a powerful adversarial model whose generator automatically produces realistic 3D objects with annotations, while the discriminator distinguishes them from the training distribution and recognizes their categories. We train the discriminative classifier simultaneously with the generator to predict the class label by embedding a softmax classifier. Progressive learning uses input samples from lower to higher resolutions to increase the generator performance gradually and to produce informative objects for a given class. The key idea of adopting progressive learning is to mitigate overshoot issues of the discriminator and to increase variation in the generated objects. This strategy helps the generator to produce more realistic synthetic objects and improves the classification performance of the discriminator. PC-GAN is trained for object classification in a supervised manner, and its performance is evaluated on two public datasets. Experimental results demonstrate that the adversarial PC-GAN outperforms the existing volumetric discriminative classifiers in terms of classification accuracy.
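
As a rough illustration of the volumetric voxel representation that the abstract builds on, the following minimal sketch maps a set of 3D points onto a dense binary occupancy grid. It is not the preprocessing used in the thesis (which relies on an octree representation); the function names `voxelize` and `occupancy`, and the choice of a dense nested-list grid, are assumptions made for illustration only.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]
Grid = List[List[List[int]]]


def voxelize(points: Iterable[Point], resolution: int) -> Grid:
    """Map 3D points onto a resolution^3 binary occupancy grid (hypothetical helper)."""
    pts = list(points)
    if not pts or resolution < 1:
        raise ValueError("need at least one point and resolution >= 1")
    # Axis-aligned bounding box of the point set.
    mins = [min(p[a] for p in pts) for a in range(3)]
    maxs = [max(p[a] for p in pts) for a in range(3)]
    # One uniform scale for all axes, so the object's aspect ratio is preserved.
    extent = max(maxs[a] - mins[a] for a in range(3)) or 1.0
    grid = [[[0] * resolution for _ in range(resolution)] for _ in range(resolution)]
    for p in pts:
        idx = []
        for a in range(3):
            i = int((p[a] - mins[a]) / extent * resolution)
            idx.append(min(i, resolution - 1))  # clamp points lying on the far face
        grid[idx[0]][idx[1]][idx[2]] = 1
    return grid


def occupancy(grid: Grid) -> int:
    """Count occupied voxels in the grid (hypothetical helper)."""
    return sum(v for plane in grid for row in plane for v in row)
```

Calling `voxelize(points, 8)`, then `voxelize(points, 16)` and `voxelize(points, 32)` on the same object yields the kind of low-to-high resolution sequence that a progressive scheme could consume. The number of cells in a dense grid grows with the cube of the resolution, which is consistent with the computational overhead the abstract attributes to volumetric CNNs at higher resolutions.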
Keywords/Search Tags: 3D object recognition, 3D shape analysis, object classification, volumetric representation, deep learning, convolutional neural network, volumetric CNN, GAN