Deep learning has been widely used in computer vision tasks, especially in 2D image classification and segmentation. Recently, the availability of large 3D repositories and greater computational power has made it possible to apply deep learning techniques to 3D data. However, although deep learning is well established in the analysis of 2D images, applying it to 3D data poses different challenges. Unlike 2D images, 3D data do not have a regular structure. As a result, researchers have either transformed 3D data into a regular form or refined the convolution operation to cope with this irregularity. In this work, we use a hybrid representation of 3D data (point clouds and volumetric grids) to overcome the structural problems associated with 3D data.

The main contributions of this work are as follows:

· We introduce a new deep convolutional neural network for 3D object classification and part segmentation. The network takes the point cloud and volumetric representations directly as input and outputs an object category or per-part labels. The key to our approach is to embed point clouds into a volumetric voxel grid, combining the flexibility of point clouds with the regular structure that 3D convolution requires.

· We introduce a voxel feature encoder that captures the local relationships among the points within each voxel before they are fed into the deep network. We design and investigate three different types of voxel feature encoder, each with its own advantages and disadvantages. The main objective of the encoders is to extract the local geometric features of the point cloud in order to improve the network's ability to capture fine-grained details of 3D objects.

· We verify that the local relationships among points within a local group of the point cloud are key to extracting fine-grained details of 3D objects. We also verify how the voxel size affects the time complexity of the 3D convolution operations. Furthermore, we apply various local region sampling approaches to sample points within each voxel.

Finally, we tested our deep convolutional neural network on the ModelNet10, ModelNet40, ModelNet50, XmuNonRigid and Schelling benchmarks for the classification task, and on the ShapeNet-part benchmark for semantic part segmentation. In 3D classification, the proposed method achieved accuracies of 93.4%, 88.2%, 87.7%, 95.0% and 92.5% on ModelNet10, ModelNet40, ModelNet50, XmuNonRigid and Schelling, respectively. In semantic part segmentation, our model achieved an mIoU of 83.1% on the ShapeNet-part benchmark.
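To make the idea of embedding a point cloud into a voxel grid concrete, the following is a minimal NumPy sketch, not the network or the encoders described in this work. It assumes a cubic grid, normalisation of the cloud into the unit cube, and uniform random sampling as the local region sampling strategy; the function names `group_points_by_voxel` and `occupancy_grid` and the parameters `grid_size` and `max_points` are hypothetical and chosen only for illustration.

```python
import numpy as np


def group_points_by_voxel(points, grid_size=8, max_points=4, seed=0):
    """Group an (N, 3) point cloud by voxel, keeping at most max_points per voxel.

    Illustrative assumptions: cubic grid, unit-cube normalisation,
    uniform random sampling inside each voxel.
    """
    rng = np.random.default_rng(seed)

    # Normalise the cloud into the unit cube while preserving aspect ratio.
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    extent = extent if extent > 0 else 1.0
    normalised = (points - mins) / extent

    # Integer voxel coordinates; clip so boundary points stay inside the grid.
    idx = np.clip((normalised * grid_size).astype(int), 0, grid_size - 1)

    # Flatten (i, j, k) into a single key per voxel.
    keys = np.ravel_multi_index(tuple(idx.T), (grid_size,) * 3)

    voxels = {}
    for key in np.unique(keys):
        members = np.flatnonzero(keys == key)
        if len(members) > max_points:
            # Subsample crowded voxels without replacement.
            members = rng.choice(members, size=max_points, replace=False)
        voxels[int(key)] = points[members]
    return voxels


def occupancy_grid(voxels, grid_size=8):
    """Binary occupancy volume: 1 where a voxel holds at least one point."""
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    for key in voxels:
        grid[np.unravel_index(key, grid.shape)] = 1.0
    return grid


if __name__ == "__main__":
    cloud = np.random.default_rng(1).random((1024, 3))
    voxels = group_points_by_voxel(cloud, grid_size=8, max_points=4)
    grid = occupancy_grid(voxels, grid_size=8)
    print(len(voxels), "occupied voxels; grid shape", grid.shape)
```

In a sketch like this, the per-voxel point groups would be the input to a voxel feature encoder, while the regular grid is what a 3D convolution operates on. A smaller `grid_size` gives fewer, larger voxels and therefore cheaper 3D convolutions, at the cost of coarser spatial detail.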