
Research On Automatic Image Annotation Based On Semi-Supervised Learning

Posted on: 2019-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: L Lin
GTID: 2428330566476051
Subject: Software engineering
Abstract/Summary:
With the spread of networks and digital devices, multimedia data such as images, audio, and video are growing rapidly. How to manage these unlabeled data sensibly and give users efficient browsing and search has been studied extensively. Image retrieval has been a very active research field since the 1970s. However, the heavy workload of manual annotation and the existence of the "semantic gap" have greatly limited the development of image retrieval technology. Automatic image annotation, the algorithmic process of assigning one or more descriptive keywords to an image, has been widely adopted to address this problem, and researchers have proposed many automatic image annotation methods to date.

This thesis builds on that earlier work to explore automatic image annotation further. It first analyzes the research background and the state of research on automatic image annotation in China and abroad, and then discusses the key technologies involved. The main work follows: Chapter 3 proposes an automatic image annotation method based on a hybrid model; Chapter 4 describes an automatic image annotation method based on co-training, which uses unlabeled data to improve the performance of the annotation model, and gives a comparative analysis of the experimental results; Chapter 5 proposes an annotation refinement method based on word relevance that removes redundant label words. Chapter 6 summarizes the work and discusses future work.

The main research contents of the thesis are as follows:

1. A hybrid model, LDA-SVM, is proposed for automatic image annotation. First, the bag-of-words model is used to integrate multiple image features into a bag-of-words representation of each image. An LDA model is then built to mine the latent topics of the images, and two latent distributions are learned from the training images: the latent topic distribution of each image and the visual-word topic distribution. A multiclass SVM classifier is trained on the image topic distributions, which serve as an intermediate representation vector. In the annotation phase, the latent topic distribution of a test image is inferred from the visual-word topic distribution learned during training and passed to the trained SVM to obtain a ranked word sequence; the five words with the highest probability are selected as the final keywords. Experiments are conducted on two benchmark image datasets, Corel 5K and IAPR TC-12, using the standard evaluation criteria for image annotation. Comparisons with several strong image annotation methods show the feasibility and efficiency of LDA-SVM.

2. An automatic image annotation method based on co-training is proposed, which makes full use of unlabeled samples to improve annotation performance. Following co-training theory, different feature sets are extracted from the images, and two different classifiers, LDA-SVM and a neural network, are co-trained. Unlabeled data are valuable because they can effectively help the classifiers train further, and the key to semi-supervised learning methods is how to obtain high-quality unlabeled samples. The thesis proposes an adaptive weighted fusion method to measure the confidence of unlabeled samples, and the two classifiers are retrained with the high-confidence samples. Two large-scale datasets, IAPR TC-12 and NUS-WIDE, are used in the experiments. The comparison results show that this method annotates better than most classical image annotation methods, which verifies that it can effectively use unlabeled samples to improve image annotation performance.

3. An image annotation refinement method based on word relevance is proposed. The correlation between keywords is measured by weighted mutual information. Because keywords and images are closely tied, the similarity between the image to be labeled and the related images in the training set is incorporated into the mutual information calculation to build a weighted mutual information model. The confidence of the initial annotation words is then taken into account to obtain the final confidence of each word, and several highly confident words are combined to form the keyword set of the annotated image. Experiments with different numbers of initial annotation words show that the method can effectively refine image annotations.
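The word-relevance refinement described above can be illustrated with a small sketch. It is not the thesis's actual formulation, which the abstract does not give. It assumes that each training image contributes to the label statistics in proportion to its visual similarity to the image being annotated, that relevance is the pointwise mutual information clipped at zero, and that the final confidence is a linear mix of initial confidence and average relevance. The names `weighted_pmi`, `refine`, `alpha`, and `keep`, and all the numbers, are hypothetical.

```python
import math


def weighted_pmi(neighbors, w, v):
    """Similarity-weighted pointwise mutual information of labels w and v.

    neighbors: list of (similarity, labels) pairs, one per training image,
    where similarity is that image's visual similarity to the query image.
    The weighting scheme is an assumption made for this sketch.
    """
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        return 0.0
    p_w = sum(sim for sim, labels in neighbors if w in labels) / total
    p_v = sum(sim for sim, labels in neighbors if v in labels) / total
    p_wv = sum(sim for sim, labels in neighbors
               if w in labels and v in labels) / total
    if p_wv == 0:
        return 0.0  # labels never co-occur: treated as unrelated
    # Clip negative values so relevance can only raise a score.
    return max(0.0, math.log(p_wv / (p_w * p_v)))


def refine(initial, neighbors, alpha=0.5, keep=2):
    """Rescore candidate labels and return the `keep` most confident ones.

    initial: dict mapping each candidate label to its initial confidence.
    alpha:   weight of the initial confidence versus word relevance.
    """
    scores = {}
    for w in initial:
        others = [v for v in initial if v != w]
        relevance = sum(weighted_pmi(neighbors, w, v) for v in others)
        relevance /= max(len(others), 1)
        scores[w] = alpha * initial[w] + (1 - alpha) * relevance
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Toy data: two similar seaside images and two dissimilar street images.
neighbors = [
    (0.9, {"sky", "sea"}),
    (0.8, {"sky", "sea", "beach"}),
    (0.2, {"car", "road"}),
    (0.1, {"car"}),
]
initial = {"sky": 0.6, "sea": 0.5, "car": 0.55}
print(refine(initial, neighbors))
```

In this toy case "car" starts with a higher initial confidence than "sea", but it never co-occurs with the other candidates in the similar training images, so it is the label that gets removed. With `alpha=1.0` the relevance term is ignored and the ranking falls back to the initial confidences.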
Keywords/Search Tags: automatic image annotation, co-training, topic model, neural network, support vector machine, image annotation refinement