The Research Of Automatic Image Annotation Based On Thematic Analysis

Posted on: 2013-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: J W Zhuo
GTID: 2248330371999816
Subject: Computer software and theory
Abstract/Summary:
Images are an important information carrier in daily life and study, and an indispensable information resource on the internet. Huge amounts of image data are being created, distributed, shared and exchanged. How to find the desired images quickly and accurately in this sea of data is a hot topic in image retrieval. Because text-based and content-based image retrieval are each limited by various factors, semantic-based image retrieval was proposed, and its key technology is the semantic annotation of images. However, the visual information a computer extracts from an image differs from the semantic information a user perceives in it. This creates a distance between low-level features and high-level semantics, namely the "semantic gap": the difference between the "semantic similarity" by which users judge images and the "visual similarity" by which computers judge them. As a result, automatic semantic annotation based on visual features alone falls far short of the desired performance.

This paper mainly focuses on the following research:

1. Obtaining the training set through manual annotation has many shortcomings: it is time-consuming, labor-intensive and subjective. Moreover, compared with the countless images on the web, a manually annotated training set is very small. Automatically obtaining a high-quality training set is therefore very important for automatic image semantic annotation. Compared with traditional images, web images have distinctive features: because they are stored on the internet, besides visual features they are often associated with plenty of text, such as the file name, captions, the page title and alternative text.

2. With the development of social networks, various multimedia resources are shared and disseminated.
Flickr, an image-sharing website, hosts several billion uploaded images across different categories and themes. Images are annotated to varying degrees when they are uploaded, and users can also annotate images uploaded by others that interest them. In this way, massive numbers of image annotations, that is, social annotations, are created. If the annotations of the training set can be extended and corrected using these social annotations, the quality of the training set will improve greatly. Furthermore, every image belongs to specific community topics, whose topic information is related to the high-level semantic information of the image. As a result, mining the latent topic information can improve annotation performance.

The main innovations of this paper are as follows:

1. Automatically obtaining the training set by integrating image visual features with social annotations. First, following the idea of TF-IDF, we use the text associated with each image, subject to constraints, to obtain the initial annotation of the images; this is the initial training set. Second, image visual features and social annotations are integrated to extend the annotation classification. Considering the complexity and identity of image annotations on the internet, we preprocess these annotations: noisy annotations are removed from images with multiple annotations, and for images without annotations, latent semantic information is mined from the community topic information. We can then obtain images that are similar, in both keywords and visual features, to the images in the initial training set, which improves the quality of the training set.
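As a rough illustration of the TF-IDF step described above, the following Python sketch ranks the words in each image's associated text and keeps the highest-scoring ones as the initial annotation. The function name `tfidf_initial_annotations`, the input layout, and the `top_k` and `min_score` parameters are assumptions made for this example; the abstract does not state which constraints the thesis applies, so a simple score threshold stands in for them.

```python
import math
from collections import Counter


def tfidf_initial_annotations(image_texts, top_k=3, min_score=0.0):
    """Pick initial annotation words per image by TF-IDF score.

    image_texts: dict mapping an image id to the list of tokens taken from
    its associated text (file name, caption, page title, alt text, ...).
    top_k and min_score are hypothetical stand-ins for the "constraints"
    mentioned in the abstract.
    """
    n_docs = len(image_texts)

    # Document frequency: in how many images' texts each word appears.
    doc_freq = Counter()
    for tokens in image_texts.values():
        doc_freq.update(set(tokens))

    annotations = {}
    for image_id, tokens in image_texts.items():
        if not tokens:
            annotations[image_id] = []
            continue
        term_counts = Counter(tokens)
        scores = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in term_counts.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        annotations[image_id] = [w for w, s in ranked[:top_k] if s > min_score]
    return annotations


example_texts = {
    "img_a": ["beach", "sunset", "photo", "beach"],
    "img_b": ["city", "night", "photo"],
    "img_c": ["photo", "dog"],
}
initial = tfidf_initial_annotations(example_texts, top_k=2)
```

In this toy input the word "photo" occurs in every image's text, so its inverse document frequency is zero and it is never selected, which is the behavior TF-IDF is meant to provide for uninformative words.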
When the annotations of the initial training set are extended by integrating visual features and social annotations, the paper proposes, based on the transitivity of image similarity, a new adaptive algorithm for acquiring the neighborhood set of an image, whose members are similar to it in both visual features and label semantics. By comparing adjacent pairs of images, the size of this neighborhood set can be determined adaptively.

2. Automatic annotation based on thematic analysis of image semantics. Given the automatically acquired large-scale, high-quality training set, the SVM method uses latent semantic analysis to obtain a topic model by analyzing the training image set, then uses the topic information of these images to annotate image semantics automatically and obtain the initial annotation keyword set. Afterwards, latent topic information within the same topic and the annotations of visually neighboring images in different topics are used to extend the initial annotation keyword set and improve annotation performance.
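The adaptive neighborhood idea mentioned under the first innovation can be sketched in Python as follows. This is only one plausible reading, not the thesis's algorithm: the abstract does not give the actual stopping criterion, so the sketch assumes that candidates are ranked by a combined visual and tag similarity and that the neighborhood ends at the first large drop between two adjacent images in that ranking. The names `adaptive_neighborhood`, `alpha` and `max_drop`, the use of cosine and Jaccard similarity, and the dictionary layout of an image are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(tags_a, tags_b):
    """Overlap of two tag sets, used here as a crude semantic similarity."""
    union = tags_a | tags_b
    return len(tags_a & tags_b) / len(union) if union else 0.0


def combined_similarity(img_a, img_b, alpha=0.5):
    """Weighted mix of visual and tag similarity (alpha is an assumption)."""
    visual = cosine(img_a["feature"], img_b["feature"])
    semantic = jaccard(img_a["tags"], img_b["tags"])
    return alpha * visual + (1 - alpha) * semantic


def adaptive_neighborhood(query, candidates, alpha=0.5, max_drop=0.3):
    """Return the neighbors of `query`, choosing the set size adaptively.

    Candidates are ranked by combined similarity. Walking down the ranking,
    each image is compared with the one ranked just before it; the
    neighborhood is cut at the first similarity drop larger than max_drop.
    """
    ranked = sorted(
        ((combined_similarity(query, c, alpha), c) for c in candidates),
        key=lambda pair: -pair[0],
    )
    neighbors = []
    previous_sim = None
    for sim, candidate in ranked:
        if previous_sim is not None and previous_sim - sim > max_drop:
            break  # large gap between adjacent images: stop growing the set
        neighbors.append(candidate)
        previous_sim = sim
    return neighbors


query = {"id": "q", "feature": [1.0, 0.0], "tags": {"beach", "sea"}}
pool = [
    {"id": "n3", "feature": [0.0, 1.0], "tags": {"city"}},
    {"id": "n1", "feature": [1.0, 0.0], "tags": {"beach", "sea"}},
    {"id": "n2", "feature": [0.9, 0.1], "tags": {"beach"}},
]
neighbor_ids = [n["id"] for n in adaptive_neighborhood(query, pool)]
```

With this stopping rule the neighborhood size is not fixed in advance: a query with many close matches gets a large set, while one with few gets a small set. The tags of the selected neighbors could then be used to extend the annotations of the query image.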
Keywords/Search Tags: Automatic image annotation, Automatic training set acquisition, Social tags, Thematic analysis