
Research On Multi-label Image Classification With Visual Attention And Context Correlation

Posted on: 2020-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: Q L Meng
Full Text: PDF
GTID: 2428330611998714
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, the volume of multimedia data continues to grow. Classifying large-scale multimedia data is challenging, in large part because a single item can belong to several categories at the same time. Images make up a large proportion of multimedia data, so multi-label image classification, in which one image can carry multiple class labels, has received increasing attention. Existing multi-label image classification methods have two main problems. First, they do not consider the context correlation in the image, even though the relationships between class labels can be used to improve classification performance. Second, they ignore the spatial information in the image, so class labels are wrongly associated with image regions. To address these problems, this thesis proposes three multi-label image classification methods. The main research contents are summarized as follows.

First, to address the neglect of spatial information, this thesis proposes a multi-label image classification model based on an attention mechanism. The model uses ResNet to extract features, uses a CNN to generate an attention map for each class label, weights the feature map with each attention map, and classifies from the weighted feature maps. Experimental results show that the model focuses on the image regions corresponding to each class label, which improves classification performance.

Second, to make better use of the relationships between labels, this thesis proposes a multi-label image classification model based on a spatial transformer network (STN) and long short-term memory (LSTM). The model uses VGG to extract features, uses the STN to implement the attention mechanism, and uses the LSTM to capture the relationships between labels and classify images. Experimental results show that the model effectively locates the target regions on the feature map, which improves classification performance.

Finally, although an LSTM can capture relationships between labels, it captures only local relationships. To make full use of the relationships between labels, this thesis proposes a multi-label image classification model based on a graph convolutional network (GCN) and an attention mechanism. The model uses ResNet to extract features and an attention mechanism to obtain weighted features, then fuses the two sets of features by weighting. The GCN learns the global relationships between labels and uses the fused features to perform classification. Experimental results show that the GCN and the attention mechanism effectively improve classification performance and that the model is competitive with recent methods.

Based on the above work, this thesis designs and implements a multi-label image classification system that supports image annotation, image retrieval, and similar-image search.
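
The per-label attention idea behind the first model can be illustrated with a small, dependency-free sketch. It is not the thesis implementation: the abstract does not specify the attention branch, so a linear score per spatial position followed by a softmax stands in for the CNN here, and the names `attention_classify`, `attn_weights`, and `cls_weights`, as well as the toy 2x2 feature map, are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_classify(features, attn_weights, cls_weights):
    """Score each label from its own attention-weighted feature vector.

    features:     P spatial positions, each a D-dim vector (a flattened feature map)
    attn_weights: C vectors of size D; an assumed stand-in for the attention CNN
    cls_weights:  C vectors of size D; one linear classifier per label
    Returns (probabilities, attention_maps).
    """
    probs, maps = [], []
    for w_att, w_cls in zip(attn_weights, cls_weights):
        # Attention map for this label: one weight per position, summing to 1.
        attn = softmax([dot(w_att, f) for f in features])
        # Weight the feature map by the attention map and pool over positions.
        pooled = [sum(a * f[d] for a, f in zip(attn, features))
                  for d in range(len(features[0]))]
        # Independent sigmoid per label: labels are not mutually exclusive.
        probs.append(sigmoid(dot(w_cls, pooled)))
        maps.append(attn)
    return probs, maps


# Toy input: a 2x2 feature map flattened to 4 positions, D = 2 channels.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
attn_weights = [[4.0, 0.0], [0.0, 4.0]]    # label 0 attends to channel 0, label 1 to channel 1
cls_weights = [[3.0, -3.0], [-3.0, 3.0]]
probs, maps = attention_classify(features, attn_weights, cls_weights)
```

In this toy input each label's attention map peaks at a different position, and both labels score above 0.5, which is the multi-label behaviour a single softmax over classes could not express.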
Keywords/Search Tags: deep learning, multi-label image classification, attention mechanism, context correlation