Research On Enhancing The Discriminant Power Of Convolutional Neural Networks

Posted on: 2020-07-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F S Hao
Full Text: PDF
GTID: 1368330623955851
Subject: Signal and Information Processing
Abstract/Summary:
Image recognition is a popular research topic in computer vision, and its success relies on discriminative features. Convolutional neural networks (CNNs), especially deep CNNs, have achieved great success in image recognition because of their strong non-linear representation capacity. However, whether trained on large-scale labeled datasets or on only a few labeled samples, existing methods do not fully exploit the discriminant power of CNNs. The resulting deep features are therefore less discriminative, which limits the performance of existing methods. This dissertation is dedicated to enhancing the discriminativeness of CNNs, and its main contributions are summarized as follows.

1. A novel Anchor-based Angular Loss (AAL) is proposed, which is designed to impose intra-class compactness and inter-class separability simultaneously. Existing loss functions that rely on class centers need to update those centers iteratively during training. However, because computing resources are limited, it is often unrealistic to use the entire large-scale labeled training dataset to update the class centers in each training iteration. This dissertation proposes to replace class centers with anchors, which are predefined vectors regarded as the class means and kept fixed during training. Intra-class compactness is achieved by constraining the CNN to map training samples as close as possible to their corresponding anchors. In addition, two principles are designed to keep the anchors as far apart as possible, which ensures inter-class separability. Further, when the standard orthonormal basis is adopted as the anchors, AAL can be implemented with a single normalization operation. Visualization experiments show that AAL does enhance the discriminativeness of CNNs, and the performance improvements in image classification and face verification also demonstrate its effectiveness.

2. A Semantic Alignment Metric Learning (SAML) method is proposed, which is designed to perform semantic alignment before comparing two images. Existing 3D tensor-based metric learning methods usually compare two images directly, which may cause ambiguity and weaken the discriminativeness of CNNs because dominant objects can be located anywhere in an image. To suppress comparisons between semantically irrelevant regions, this dissertation proposes to perform semantic alignment before comparing two images through a “collect-and-select” strategy. Specifically, a relation matrix (RM) is calculated to “collect” the distances between each pair of local regions of the 3D tensors extracted from the two images. Then, an attention technique is adapted to “select” the semantically relevant pairs and put more weight on them. Theoretical analysis of the generalization ability of SAML provides a theoretical guarantee, and empirical results demonstrate that semantic alignment is achieved. Extensive experiments on benchmark datasets validate the strengths of the proposed approach and show that SAML significantly outperforms the current state-of-the-art methods.

3. An instance-level fast embedding adaptation mechanism is proposed, which is designed to rapidly adapt the embedded deep features and thereby improve their generalization ability in recognizing novel categories. Existing metric learning-based methods lack a fast adaptation mechanism for dealing with novel categories. The sample distributions of novel categories differ from those seen in training, and the few samples drawn from these distributions are not always representative; these two factors limit the discriminativeness of CNNs. This dissertation proposes a novel instance-level fast embedding adaptation mechanism, realized by an Attention Adaptation Module (AAM), to enhance the discriminativeness of CNNs. After the embedded deep features are adapted, the cosine distance between the query instance and its corresponding class center is greatly increased, while the cosine distance between the query instance and its non-corresponding class centers remains unchanged before and after the adaptation of the class centers. Note that the fast embedding adaptation is performed before nearest neighbor classification, and thus the discriminativeness of CNNs is enhanced. Experimental analysis shows that the AAM enhances the discriminativeness of CNNs as expected, thus improving image recognition performance.
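
The remark that AAL needs only one normalization operation when the anchors are the standard orthonormal basis can be illustrated with a small sketch. The abstract does not give the loss formula, so the code below assumes, purely for illustration, that the loss is one minus the cosine between a feature vector and its class anchor; the function name `anchor_angular_loss` and the `eps` parameter are hypothetical. With anchor e_k, the cosine reduces to f_k / ||f||, so L2-normalizing the feature yields the cosine to every anchor at once.

```python
import math


def anchor_angular_loss(features, labels, eps=1e-12):
    """Mean (1 - cosine) between each feature and its class anchor.

    Assumption: the anchors are the standard basis vectors e_0..e_{C-1},
    so the feature dimension equals the number of classes. The exact AAL
    formulation in the dissertation may differ from this sketch.
    """
    total = 0.0
    for feature, label in zip(features, labels):
        # One L2 normalization gives the cosine to every anchor,
        # because cos(f, e_k) = f_k / ||f||.
        norm = math.sqrt(sum(x * x for x in feature))
        cosines = [x / max(norm, eps) for x in feature]
        # Penalize the angle to the anchor of the true class only.
        total += 1.0 - cosines[label]
    return total / len(labels)


features = [
    [3.0, 0.0, 0.0],  # perfectly aligned with anchor 0
    [0.0, 1.0, 1.0],  # 45 degrees away from anchor 1
]
labels = [0, 1]
loss = anchor_angular_loss(features, labels)
print(round(loss, 4))  # 0.1464
```

The first sample contributes zero loss because it already lies on its anchor, while the second contributes 1 - 1/sqrt(2); since the anchors are fixed, no class-center update is needed during training.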
Keywords/Search Tags: Discriminativeness, Convolutional Neural Networks, Few-shot Learning, Semantic Alignment, Fast Embedding Adaptation