Recognition of 3D objects is a fundamental task in computer vision. In recent years, multi-view based deep learning has emerged as an effective approach for 3D object recognition. In this thesis, we aim to solve problems of existing view-based approaches and improve their effectiveness.

Early view-based approaches adopt view pooling layers to aggregate multiple view features into a single compact one. However, such view pooling layers use a fixed pooling scheme (e.g., max or average operation), and thus suffer from information loss or contamination. To solve this, we propose an adaptive rank pooling layer, which learns a proper pooling scheme from data by adaptively adjusting the weights assigned to each feature.

More recent view-based methods tend to mine the similarities among views. However, most existing view-based methods treat the views of an object as an unordered set, which ignores the dynamic relations among the views, e.g., sequential semantic dependencies. To address this issue, we propose to treat the views of an object as a sequence. We exploit the long-term dependencies among different views for shape recognition by constructing a sequence-aware view aggregation module based on the bi-directional Long Short-Term Memory network.

Correspondences of object parts provide discriminative cues for 3D object recognition. However, existing multi-view based deep learning approaches have not explicitly exploited such correspondences. Besides, existing approaches ignore the spatial viewpoint setting of multi-view images, which encodes rich 3D relations among views. In this thesis, we propose a plug-and-play module, called the 3D-Aware Correspondence Learning module (3ACL module), that explicitly encodes the local intra-view/inter-view correspondences with explicit consideration of the spatial settings of viewpoints. The 3ACL module can be easily plugged into any modern convolutional neural network and trained jointly.

We conduct extensive experiments on
three widely used benchmark datasets to evaluate our proposed methods. Experiments show that our methods achieve state-of-the-art results on 3D object classification and retrieval tasks, demonstrating their effectiveness.
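
The abstract does not give the exact formulation of the adaptive rank pooling layer, so the following is only a minimal NumPy sketch of one plausible reading: each feature channel is sorted across views, and a weight is attached to each rank position. The function name `rank_pool`, the parameter `rank_logits`, and the softmax normalization are assumptions made for illustration, not the thesis's actual design. The sketch shows why a weighting over ranks can subsume the fixed schemes mentioned above: uniform weights give average pooling, and putting all weight on the top rank gives max pooling.

```python
import numpy as np

def rank_pool(view_features, rank_logits):
    """Pool V view descriptors (shape (V, D)) into one D-dim descriptor.

    rank_logits is a hypothetical parameter of shape (V,): one score per
    rank position. In a trained layer it would be learned from data.
    """
    # Sort every feature channel across views, largest value first.
    ranked = -np.sort(-view_features, axis=0)
    # Softmax (assumed normalization): non-negative weights summing to 1.
    shifted = rank_logits - rank_logits.max()
    weights = np.exp(shifted) / np.exp(shifted).sum()
    # Weighted sum over rank positions, independently per channel.
    return weights @ ranked

# Three views with two feature channels each (toy values).
views = np.array([[1.0, 4.0],
                  [3.0, 2.0],
                  [2.0, 6.0]])

avg_like = rank_pool(views, np.zeros(3))                 # uniform weights
max_like = rank_pool(views, np.array([50.0, 0.0, 0.0]))  # weight on rank 1
```

With uniform logits the result matches the per-channel mean of the views, and with a dominant first logit it approaches the per-channel maximum; intermediate weightings fall between these two fixed schemes.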