
Research on Automatic Image Annotation and Annotation Refinement Algorithms

Posted on: 2013-01-28    Degree: Doctor    Type: Dissertation
Country: China    Candidate: H Y Song    Full Text: PDF
GTID: 1118330371481387    Subject: Computer software and theory
Abstract/Summary:
With the popularity of digital cameras and other digital imaging products, images are created in a cheaper and easier manner than ever. With the spread of computer networks, microblogs, and image-sharing websites such as Picasa and Flickr, the dissemination of images has also accelerated. It is therefore urgent to advance image research so that the capability of image management and understanding keeps pace with the growth in the number of images. Automatic image annotation is the core technique for image management and understanding. In this dissertation, our research focuses on narrowing the semantic gap and improving annotation performance and efficiency. The main contributions and innovative points are as follows:

(1) An improved CMRM algorithm is proposed, and a general formula of the relevance model is presented. The general formula is then interpreted from the perspectives of probability theory, information retrieval, and multi-modal analysis, respectively. Based on this comparison and analysis, we point out the core techniques of the relevance model and the future research emphases of this area. Experimental results show that the improved CMRM outperforms the original CMRM. (A sketch of the underlying relevance-model scoring is given after item (2) below.)

(2) An identification-vector annotation method based on positive and negative examples is proposed. To improve accuracy, traditional classification-based annotation methods adopt more abstract models and more complicated algorithms, but this degrades efficiency. Our proposed method is based on trans-media analysis: by constructing an identification vector in the visual feature space for every textual keyword or semantic concept, we convert the image annotation problem into finding the most similar identification vectors. The identification vector of a textual keyword is the difference between the mean feature vector of its positive examples and that of its negative examples (see the second sketch below). Compared with traditional methods, our proposed method has (a) a simpler model, (b) lower time cost for training and testing, and (c) better annotation performance in terms of mean per-word metrics. The proposed method can be used as an independent tool for image annotation; beyond that, it can serve as a reference to decide whether a more complex model or feature descriptor needs to be used.
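For reference, the following is a minimal sketch of standard relevance-model (CMRM-style) scoring as discussed in contribution (1): an un-annotated image, represented here as a bag of discrete visual tokens ("blobs"), is scored against every keyword by summing over the training images, with collection-level smoothing. The function and variable names, the smoothing constants, and the discrete-token representation are illustrative assumptions; the improvements of the proposed CMRM variant are not reproduced here.

```python
from collections import Counter

def cmrm_scores(test_blobs, train_images, alpha=0.1, beta=0.9):
    """Score every keyword for an un-annotated image under a CMRM-style
    relevance model: P(w, b1..bm) = sum_J P(J) * P(w|J) * prod_i P(b_i|J).

    train_images: list of (keywords, blobs) pairs, where keywords and blobs
    are lists of discrete tokens (illustrative representation).
    """
    # Collection-level counts used for smoothing.
    word_cnt, blob_cnt, coll_size = Counter(), Counter(), 0
    for words, blobs in train_images:
        word_cnt.update(words)
        blob_cnt.update(blobs)
        coll_size += len(words) + len(blobs)

    vocab = set(word_cnt)
    scores = {w: 0.0 for w in vocab}
    prior = 1.0 / len(train_images)          # uniform P(J)

    for words, blobs in train_images:
        j_size = len(words) + len(blobs)
        wc, bc = Counter(words), Counter(blobs)
        # prod_i P(b_i | J), smoothed with collection frequencies.
        p_blobs = 1.0
        for b in test_blobs:
            p_blobs *= ((1 - beta) * bc[b] / j_size
                        + beta * blob_cnt[b] / coll_size)
        for w in vocab:
            p_w = ((1 - alpha) * wc[w] / j_size
                   + alpha * word_cnt[w] / coll_size)
            scores[w] += prior * p_w * p_blobs
    return scores
```

Annotating the test image then amounts to keeping the top-n keywords of `scores`.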
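The identification-vector construction of contribution (2) can be sketched as follows, assuming each image is described by a fixed-length visual feature vector and that cosine similarity is used to compare an image with the keyword vectors (the similarity measure and the helper names are illustrative assumptions):

```python
import numpy as np

def build_identification_vectors(features, labels, vocabulary):
    """Identification vector of a keyword = mean feature vector of its
    positive examples minus mean feature vector of its negative examples.

    features: (n_images, d) array; labels: list of keyword sets per image.
    Assumes every keyword has at least one positive and one negative example.
    """
    id_vectors = {}
    for word in vocabulary:
        pos = np.array([i for i, ws in enumerate(labels) if word in ws])
        neg = np.array([i for i, ws in enumerate(labels) if word not in ws])
        id_vectors[word] = features[pos].mean(axis=0) - features[neg].mean(axis=0)
    return id_vectors

def annotate(image_feature, id_vectors, top_n=5):
    """Rank keywords by cosine similarity between the image feature and each
    keyword's identification vector, and return the top-n words."""
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    ranked = sorted(id_vectors,
                    key=lambda w: cosine(image_feature, id_vectors[w]),
                    reverse=True)
    return ranked[:top_n]
```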
(3) A new image annotation method named LLPLSA, based on local learning, is proposed. In its training process, object selection, weight distribution, and parameter setting are all learned from the local feature space of the un-annotated (newly arriving test) image. Although existing PLSA-based image annotation methods describe images with high-level semantic topics, their performance is often unsatisfactory and is surpassed by methods that use low-level visual feature descriptors. Moreover, the training time cost of PLSA is very high because of the iterative EM algorithm; as the training dataset grows, the rapidly increasing time and space costs hinder its application to large-scale image datasets, and many researchers therefore conclude that PLSA methods cannot be applied to medium- or large-scale image datasets. In our proposed LLPLSA, instead of the whole training dataset, only a fixed number of images (10-20 images) related to the un-annotated image are involved in the training process, and the determination of relevant images and the similarity computation are based on a combination of multi-modal information and the context of the feature spaces (a simplified sketch of this local-learning step is given after item (4) below). Because the number of relevant images is independent of the size of the whole dataset, LLPLSA is a scalable model and can be applied to large-scale image datasets. To improve the quality of the trained model and narrow the semantic gap, we propose a weighted PLSA model within LLPLSA. To improve its adaptive ability, the model parameters (e.g., the number of topics) are determined dynamically from the local-space information associated with the images involved in training. In terms of F1-measure, experiments on the Corel5k and IAPR TC-12 datasets show that LLPLSA outperforms PLSAWORDS by 63% and 75%, respectively, and also outperforms the majority of other mainstream image annotation methods. The experimental results on Corel5k and IAPR TC-12 further demonstrate that LLPLSA is scalable, with a time cost independent of the size of the training set.

(4) Two image annotation refinement methods are proposed: mutual-information-based annotation refinement (MIAR) and weighted-mutual-information-based annotation refinement (WMIAR). In MIAR, mutual information is used to measure relevance: during refinement, the relevance between candidate words and the already confirmed words is evaluated in decreasing order of their confidence values in the initial annotation result, and refinement is achieved by discarding noise words whose relevance values fall below a specified threshold. In WMIAR, weighted mutual information is used to measure relevance: when computing the relevance value, instead of simply counting word co-occurrences, the proposed method combines the co-occurrence frequency with the similarity between the un-annotated image and the training images in which the candidate annotations occur, and when computing the relationship between words, WMIAR assigns weights according to the confidence values in the initial annotation result. By considering both the global, general rules of word relationships and the specific visual information of the un-annotated image, WMIAR can accurately describe the relationship between an un-annotated image and its candidate annotation words. The experimental results show that the proposed MIAR and WMIAR algorithms achieve better performance by discarding noise words and reordering the candidate annotation words.
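As a rough illustration of the local-learning idea behind LLPLSA in contribution (3), the sketch below selects a fixed number of training images most similar to the un-annotated image and then fits a small PLSA model on that neighborhood only, choosing the topic number from the neighborhood size. The neighbor selection by visual similarity alone, the plain (unweighted) EM updates, and the topic-number heuristic are simplifying assumptions; the dissertation's weighted PLSA and multi-modal relevance determination are not reproduced here.

```python
import numpy as np

def local_plsa(test_feat, train_feats, train_counts, k=15, n_iter=50, seed=0):
    """LLPLSA-style local learning (simplified): pick the k training images
    nearest to the test image, then fit plain PLSA by EM on that local
    term-document count matrix only.

    train_feats: (n, d) visual features; train_counts: (n, V) word counts.
    """
    # 1. Local neighborhood: k most similar training images (Euclidean here).
    dists = np.linalg.norm(train_feats - test_feat, axis=1)
    neighbors = np.argsort(dists)[:k]
    counts = train_counts[neighbors].astype(float)        # (k, V)

    # 2. Topic number chosen from the neighborhood size (illustrative heuristic).
    n_topics = max(2, k // 3)

    rng = np.random.default_rng(seed)
    p_z_d = rng.random((k, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, counts.shape[1])); p_w_z /= p_w_z.sum(1, keepdims=True)

    for _ in range(n_iter):
        # E-step: responsibilities P(z | d, w), shape (k, n_topics, V).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate P(w|z) and P(z|d) from expected counts.
        weighted = counts[:, None, :] * resp               # n(d,w) * P(z|d,w)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return neighbors, p_z_d, p_w_z
```

The un-annotated image can then be folded in against the local topic-word distributions to obtain its topic mixture and keyword scores.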
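The MIAR refinement step of contribution (4) can be illustrated roughly as follows: candidate words are visited in decreasing order of their initial confidence, each candidate is compared with the already confirmed words via pointwise mutual information estimated from label co-occurrence in the training set, and candidates whose relevance falls below a threshold are discarded. The use of pointwise mutual information, the averaging over confirmed words, and the threshold value are illustrative assumptions; the weighted variant (WMIAR) is not shown.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information_tables(train_labels):
    """Estimate single-word and pairwise co-occurrence probabilities from
    the keyword sets of the training images."""
    n = len(train_labels)
    single, pair = Counter(), Counter()
    for words in train_labels:
        single.update(set(words))
        pair.update(frozenset(p) for p in combinations(sorted(set(words)), 2))
    p_w = {w: c / n for w, c in single.items()}
    p_ww = {k: c / n for k, c in pair.items()}
    return p_w, p_ww

def refine_miar(candidates, p_w, p_ww, threshold=0.0):
    """Refine an initial annotation (word -> confidence): keep the most
    confident word, then accept each further candidate only if its average
    pointwise mutual information with the confirmed words exceeds threshold."""
    ordered = sorted(candidates, key=candidates.get, reverse=True)
    confirmed = [ordered[0]]                      # most confident word kept
    for w in ordered[1:]:
        pmis = []
        for c in confirmed:
            joint = p_ww.get(frozenset((w, c)), 0.0)
            if joint > 0 and w in p_w and c in p_w:
                pmis.append(math.log(joint / (p_w[w] * p_w[c])))
            else:
                pmis.append(float("-inf"))        # never co-occur: treat as noise
        if sum(pmis) / len(pmis) > threshold:
            confirmed.append(w)
    return confirmed
```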
Keywords/Search Tags: Image Annotation, Annotation Refinement, Relevance Model, Probabilistic Latent Semantic Analysis, Weighted Mutual Information