Research On Semantic Based Image Annotation And Retrieval Algorithms

Posted on: 2008-04-18 | Degree: Master | Type: Thesis
Country: China | Candidate: T Zhang | Full Text: PDF
GTID: 2178360245998133 | Subject: Instrument Science and Technology
Abstract/Summary:
Automatic image annotation and semantic-based image retrieval are essential to multimedia information retrieval. Image annotations allow users to access a large image database with textual queries. Traditional keyword-based image retrieval systems require a great deal of manual labor to annotate images. Content-based image retrieval was therefore proposed; it tries to retrieve images directly and automatically from their visual content, such as color, texture, and shape. However, content-based image retrieval faces a fundamental problem: the "semantic gap" between low-level features and semantic concepts.

To address the semantic gap, this thesis presents an automatic annotation method based on both vector quantization and Latent Dirichlet Allocation (LDA). First, the background and significance of automatic image annotation and semantic-based retrieval are introduced, and the structure, merits, and defects of existing annotation models are reviewed. In recent years, the LDA model has been widely studied in text-based information retrieval, and many researchers have shown that it is effective for discrete data processing and dimensionality reduction. Text data and image data are similar in two respects: (1) both are very large and high-dimensional; (2) similar objects recur within an image collection, just as synonyms recur within a document corpus. This thesis elaborates the mathematical principles of the LDA model and presents a way to apply it to image data processing and semantic-based image retrieval.

Segmentation is needed to describe the meaningful regions of an image. After comparing the computation speed and segmentation results of several traditional image segmentation algorithms, the watershed algorithm is chosen. Because it suffers from over-segmentation, a modified algorithm is proposed. Then, 18 features covering color, texture, and shape are extracted from the segmented regions, and these features are clustered and condensed using vector quantization. The clustered region features correspond to "code words" in a "code book". The condensed code book can be regarded as a semantic lexicon for the image database.

Finally, the method is implemented in MATLAB and applied to a Corel image database of 400 images. The results show that the method can perform automatic image annotation and keyword-based retrieval.
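
The vector quantization step described above can be illustrated with a minimal sketch. It is not the thesis implementation (which was written in MATLAB and is not shown here). It assumes plain k-means as the quantizer, since the abstract does not name one, and it uses made-up 2-D region features instead of the 18 features mentioned. All function names and data below are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_code_word(feature, code_book):
    """Index of the code word closest to one region feature vector."""
    return min(range(len(code_book)),
               key=lambda k: squared_distance(feature, code_book[k]))

def build_code_book(features, size, iterations=20, seed=0):
    """Cluster region features with k-means (assumed quantizer);
    the cluster centers play the role of code words."""
    rng = random.Random(seed)
    code_book = [list(f) for f in rng.sample(features, size)]
    for _ in range(iterations):
        clusters = [[] for _ in range(size)]
        for f in features:
            clusters[nearest_code_word(f, code_book)].append(f)
        for k, members in enumerate(clusters):
            if members:  # keep the old code word if a cluster is empty
                code_book[k] = [sum(col) / len(members) for col in zip(*members)]
    return code_book

def code_word_histogram(region_features, code_book):
    """Count how often each code word occurs among one image's regions."""
    counts = [0] * len(code_book)
    for f in region_features:
        counts[nearest_code_word(f, code_book)] += 1
    return counts

# Hypothetical toy data: each image is a list of region feature vectors.
images = {
    "image_a": [(0.1, 0.2), (0.15, 0.1), (0.9, 0.8)],
    "image_b": [(0.85, 0.9), (0.95, 0.85), (0.2, 0.1)],
}
all_features = [f for regions in images.values() for f in regions]
code_book = build_code_book(all_features, size=2)
histograms = {name: code_word_histogram(regions, code_book)
              for name, regions in images.items()}
print(histograms)
```

Each histogram treats an image as a "document" of code words, which is the kind of discrete count data a topic model such as LDA takes as input.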
Keywords/Search Tags: Image Annotation, Semantic Gap, Latent Dirichlet Allocation, Vector Quantization