
Research On Image Classification Method Based On Label Correlation

Posted on: 2020-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
Full Text: PDF
GTID: 2428330575996885
Subject: Computer technology
Abstract/Summary:
With the rapid development of the internet, especially the mobile internet, the world produces huge amounts of image data every day. These images contain abundant information, and classifying and processing them is important for using that information effectively. An image usually contains multiple targets, scenes, behaviors and so on, so classifying such images requires multiple labels. At present, most multi-label image classification algorithms neglect the semantic correlation between labels. This thesis studies multi-label image classification methods based on deep learning in depth and uses recurrent neural networks to model label correlation. The main contents of this thesis are as follows:

1. The research background and current state of multi-label image classification are described. Several deep-learning-based multi-label image classification methods are introduced in detail, their important theories and key technologies are discussed, and their advantages and disadvantages are analyzed. In addition, the evaluation metrics and data sets for multi-label image classification are briefly introduced.

2. Many previous methods neglect the potential correlation between labels, although this correlation between targets can improve multi-label image classification to a certain extent. To address this problem, a multi-label image classification method based on recursive semantic association is proposed. The method uses a deep convolutional network to extract the semantic features of the targets, followed by a specially designed attention network that separates the target features along the channel dimension of the convolutional features. The separated features are then fed into an LSTM network to model the correlation between the targets. Comparative experiments show that the proposed model can effectively extract the correlation between labels and, to some extent, improves classification performance.

3. Existing multi-label classification networks have difficulty identifying smaller targets, which leads to a low recall rate for small targets. To address this problem, a multi-label image classification algorithm that combines a feature pyramid with an encoder-decoder architecture is proposed. First, inspired by the recent object detection network FPN, the method uses a feature pyramid network to extract image features. The feature pyramid network fuses high-level semantic information with low-level detail features to form higher-resolution feature maps, so that the features of small targets are not ignored. Second, the method introduces the encoder-decoder architecture from the field of natural language processing: a bidirectional LSTM encodes the features of the convolutional network and extracts the associations between labels, and a unidirectional LSTM decodes them and produces the classification result. Finally, experiments show that the method not only effectively improves the recognition of small targets but also effectively captures the correlation between labels.
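
As a rough illustration of the fusion idea in item 3, the sketch below shows one top-down merge step of the kind used in feature pyramid networks: a coarse, high-level feature map is upsampled by a factor of two and added element-wise to a finer, low-level map. It is a minimal sketch, not the thesis implementation. It assumes a single channel, nearest-neighbor upsampling, and no lateral 1x1 convolution, and the names `upsample_nearest_2x` and `fuse_top_down` are hypothetical.

```python
from typing import List

# One channel of a feature map, stored as rows x cols (simplifying assumption).
FeatureMap = List[List[float]]


def upsample_nearest_2x(fmap: FeatureMap) -> FeatureMap:
    """Double the height and width by repeating every value in a 2x2 block."""
    out: FeatureMap = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))  # repeat each row twice
    return out


def fuse_top_down(coarse: FeatureMap, fine: FeatureMap) -> FeatureMap:
    """Add the upsampled coarse (high-level) map to the fine (low-level) map."""
    up = upsample_nearest_2x(coarse)
    if len(up) != len(fine) or any(len(u) != len(f) for u, f in zip(up, fine)):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


# Toy example: a 1x2 high-level map fused into a 2x4 low-level map.
coarse = [[1.0, 2.0]]
fine = [[0.5, 0.5, 0.5, 0.5],
        [0.0, 0.0, 0.0, 0.0]]
fused = fuse_top_down(coarse, fine)
print(fused)
```

The fused map keeps the resolution of the fine map while carrying the values of the coarse one, which is the property the abstract relies on for small targets. A real network would apply this per channel, after learned convolutions, across several pyramid levels.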
Keywords/Search Tags: deep learning, multi-label image classification, attention network, feature pyramid