Multi-label Image Recognition With Graph Neural Network

Posted on: 2022-04-26 | Degree: Master | Type: Thesis
Country: China | Candidate: W Li | Full Text: PDF
GTID: 2518306569994739 | Subject: Computer Science and Technology
Abstract/Summary:
In the real world, images usually carry multiple labels, which makes multi-label image recognition a fundamental and important visual task that is increasingly popular across computer vision scenarios. Compared with traditional single-label image recognition, it is more challenging because objects are complex and variable and the space of label combinations is huge.

In a multi-label image, each label may be represented by multiple entities, and each entity is composed of an indefinite number of irregularly distributed pixels. Pixels at different positions may therefore be connected because together they form the same semantic label; this connection is called position correlation. Based on graph neural networks, this thesis proposes a method that models the position correlation within each image so as to enhance semantic feature interaction between position-correlated pixels. With the position correlation learned, a simple and efficient decoupling attention mechanism is used to obtain more discriminative label-aware features, which helps the model recognize each label in the image more effectively and improves performance.

Labels that co-occur in the same image are often semantically related. To capture the correlation between labels once position correlation has been learned, this thesis further proposes a multi-label image recognition algorithm that uses both position correlation and label correlation (dual correlation). The two kinds of correlation are integrated into a unified framework and trained end to end, so that they promote each other and are optimized jointly. For the label correlation learning module, this thesis also proposes a simple and effective loss function for intermediate supervision, which guides the model to learn label correlation and improves multi-label image recognition performance.

Experiments are conducted on Microsoft COCO and Pascal VOC 2007, two of the most authoritative and popular benchmarks in multi-label image recognition, and include a series of detailed comparative studies. Using position correlation alone, the proposed algorithm improves on the baseline model by 4.3% and 1.4% on the two benchmarks, respectively. Using the position and label dual correlation, performance improves further, with mAP improvements of 4.8% and 1.7% on the two benchmarks, respectively, which is comparable with current state-of-the-art results.
Keywords/Search Tags: multi-label image recognition, graph neural network, position correlation, label correlation
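
The abstract does not give the architecture in detail, so the following is only a minimal sketch of the general idea it describes, not the thesis's actual model. It assumes that position correlation can be approximated by a softmax-normalized affinity matrix between spatial positions of a feature map, that one graph-propagation step mixes features along that matrix, and that label-aware features are pooled with a simple per-label attention. All names (`position_correlation_layer`, `label_attention`, `label_queries`) and all sizes are hypothetical, and the weights are random rather than learned.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def position_correlation_layer(features, weight):
    """One graph-propagation step over spatial positions (assumed design).

    features: N rows, one per spatial position, each with C channels.
    weight:   C x C_out projection (random here, learned in practice).
    """
    dim = len(features[0])
    # Pairwise affinity between positions, scaled by sqrt(C).
    scores = matmul(features, transpose(features))
    # Row-wise softmax gives a normalized adjacency matrix over positions.
    adjacency = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    # Aggregate features from correlated positions, project, apply ReLU.
    mixed = matmul(matmul(adjacency, features), weight)
    propagated = [[max(v, 0.0) for v in row] for row in mixed]
    return adjacency, propagated


def label_attention(propagated, label_queries):
    """Pool position features into one feature vector per label."""
    logits = matmul(label_queries, transpose(propagated))
    attention = [softmax(row) for row in logits]
    return matmul(attention, propagated)


random.seed(0)
num_positions, channels, out_channels, num_labels = 9, 4, 3, 2
feature_map = [[random.gauss(0, 1) for _ in range(channels)]
               for _ in range(num_positions)]
weight = [[random.gauss(0, 0.5) for _ in range(out_channels)]
          for _ in range(channels)]
label_queries = [[random.gauss(0, 1) for _ in range(out_channels)]
                 for _ in range(num_labels)]

adjacency, propagated = position_correlation_layer(feature_map, weight)
label_features = label_attention(propagated, label_queries)
print(len(adjacency), len(adjacency[0]))
print(len(label_features), len(label_features[0]))
```

In a real system the feature map would come from a CNN backbone and each label feature would feed a per-label classifier; both steps are omitted here.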