
Research On Deep Learning Based Multi-Label Image Learning Algorithm And Application

Posted on: 2022-08-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z M Chen
Full Text: PDF
GTID: 1488306725971949
Subject: Computer Science and Technology
Abstract/Summary:
Images are the most basic and common information carrier for computer vision tasks. In general, images can be divided into single-label images (for example, an image that contains only a “dog” or a “cat”) and multi-label images (for example, an image that contains “sky”, “cloud” and “rainbow”). Research on image-based multi-label learning in computer vision aims to explore the multi-label information in images and to use it to improve computer vision tasks such as multi-label image classification and generic object detection. Because images collected in real-life scenarios often contain multi-label information, multi-label learning for images has a wide range of real-world applications. However, because the number of labels varies from image to image, multi-label learning is more challenging than single-label learning. This dissertation uses deep learning methods to study multi-label learning for images from two perspectives: using multi-label information to improve multi-label image classification, and using it to assist other computer vision tasks. The main results are summarized as follows.

1. A multi-label graph structure for multi-label image classification. Capturing the relationships between labels is the key to multi-label image classification, and existing methods fall mainly into two categories: those that build inter-image label relationships and those that build intra-image label relationships. Most current methods for constructing inter-image label relationships use word embedding vectors as auxiliary information to construct the graph nodes, and this auxiliary information may affect performance. In contrast, our method uses the network structure to extract the feature activation vector of each category as a graph node, without any auxiliary information. Specifically, we first design an efficient method to mine the co-occurrence relationships between labels from the training dataset, and then use graph convolutional networks to encode these relationships into a graph structure. In addition, we decouple the feature activation vector of each category from the global features of the input image, and use these feature activation vectors as the input nodes of the graph structure. During training, information is propagated between the feature activation vectors along the directed edges of the graph structure, which constructs the inter-image relationships between labels. Empirical results on generic multi-label image recognition demonstrate that the proposed method improves the accuracy of multi-label image classification. Moreover, the proposed method also obtains competitive results on the partial-label classification task, which demonstrates its good generalization ability.

2. A metric learning approach for multi-label image classification. In addition to studying how to properly construct the inter-image relationships between labels, we also explore how to build the intra-image relationships between labels. Most existing methods use an attention mechanism to establish intra-image label correlations implicitly, which may not capture these correlations exactly. We therefore redesign the network structure to model the label correlations explicitly by using metric learning. Specifically, we first obtain class-aware disentangled maps (CADMs) by reforming deep activations in accordance with the class-specific recognition weights. Then, after transforming the CADMs into the corresponding label vectors, we pull relevant label vectors together and push irrelevant label vectors apart to establish the intra-image relationships between labels. We also propose a ranking operation to further optimize the distances between these label vectors. Experimental results and visualizations show that this method effectively establishes the intra-image relationships between labels and improves the accuracy of multi-label image classification.

3. A multi-label contextual embedding method for generic object detection. Multi-label information can be applied not only to multi-label image classification but also to assist other computer vision tasks. In this dissertation, we focus on the generic object detection task. Because they ignore multi-label information, current two-stage detectors may lack the crucial contextual information needed to filter out noisy background detections and to recognize objects that have no distinctive appearance. Although some methods are dedicated to mining context information to improve detection, their performance is limited by inaccurate feature activation positions and limited receptive fields. To address this issue, we use image-level multi-label information to mine contextual cues. Specifically, we first use image-level multi-label signals to embed contextual information into the global CNN image features. Then, a hierarchical contextual RoI feature is generated from both the entire image and the regions of interest by exploiting this contextual information. Finally, to make full use of the hierarchical contextual RoI features, we design two different strategies for fusing them with the original RoI feature to boost the classification accuracy of detectors. Comprehensive experiments demonstrate that our method is generalizable and flexible, leading to significant and consistent improvements for almost all mainstream two-stage detectors. Furthermore, we extend this method to one-stage detectors, and the experimental results show that it still yields a good improvement there.
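As a rough illustration of the first contribution above (mining label co-occurrence from the training set and propagating per-category vectors along directed graph edges), the following Python sketch may help. The abstract does not say how co-occurrence is measured or how the graph is normalized, so the conditional-probability statistic, the `threshold` value, the row normalization, the random stand-ins for feature activation vectors and weights, and the function names `cooccurrence_adjacency` and `gcn_layer` are all assumptions made for illustration, not the dissertation's implementation.

```python
import numpy as np


def cooccurrence_adjacency(labels, threshold=0.4):
    """Build a directed label graph from a binary label matrix.

    labels: (num_images, num_classes) array of 0/1 annotations.
    Edge i -> j is kept when the estimated P(label j | label i) >= threshold.
    The statistic and the threshold are illustrative assumptions.
    """
    labels = np.asarray(labels, dtype=np.float64)
    counts = labels.T @ labels              # counts[i, j]: images with both i and j
    occurrences = np.diag(counts).copy()    # occurrences[i]: images with label i
    cond_prob = counts / np.maximum(occurrences[:, None], 1.0)
    adjacency = (cond_prob >= threshold).astype(np.float64)
    np.fill_diagonal(adjacency, 1.0)        # self-loops keep each node's own feature
    return adjacency


def gcn_layer(adjacency, node_features, weight):
    """One graph-convolution step: ReLU(row_normalized(A) @ H @ W)."""
    normalized = adjacency / adjacency.sum(axis=1, keepdims=True)
    return np.maximum(normalized @ node_features @ weight, 0.0)


# Toy training labels for the classes (sky, cloud, rainbow); hypothetical data.
train_labels = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 0],
])
adjacency = cooccurrence_adjacency(train_labels)

# Random stand-ins for the per-category feature activation vectors and weights.
rng = np.random.default_rng(0)
node_features = rng.normal(size=(3, 4))   # one 4-dimensional vector per category
weight = rng.normal(size=(4, 2))
updated = gcn_layer(adjacency, node_features, weight)
```

In this toy data every image with “rainbow” also contains “sky”, but only one of the four “sky” images contains “rainbow”, so the graph keeps the edge from “rainbow” to “sky” and drops the reverse edge. That asymmetry is why the edges are directed.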
Keywords/Search Tags: multi-label images, multi-label image classification, image object detection, deep learning, inter-image label relationship, intra-image label relationship, contextual information