Research On Semantic Based Image Annotation And Retrieval Algorithms

Posted on:2008-04-18

Degree:Master

Type:Thesis

Country:China

Candidate:T Zhang

Full Text:PDF

GTID:2178360245998133

Subject:Instrument Science and Technology

Abstract/Summary:

Automatic image annotation and semantic-based image retrieval are essential to multimedia information retrieval. Image annotations allow users to search a large image database with textual queries. Traditional keyword-based image retrieval systems require a large amount of human labor to annotate images. Content-based image retrieval was therefore proposed; it tries to retrieve images directly and automatically from their visual content, such as color, texture, and shape. However, content-based image retrieval faces a fundamental problem: the "semantic gap" between low-level features and semantic concepts.

To address this "semantic gap", this thesis presents an automatic annotation method based on both vector quantization and Latent Dirichlet Allocation (LDA). First, the background and significance of automatic image annotation and semantic-based retrieval are introduced, and the structure of existing annotation models and their merits and defects are reviewed. In recent years, the LDA model has been widely studied in the field of text-based information retrieval, and many researchers have shown that LDA is effective for discrete data processing and dimension reduction. There are strong similarities between text data and image data: (1) both are very large and high-dimensional; (2) similar objects recur within an image collection, just as synonyms recur within a document corpus. In this thesis, the mathematical principle of the LDA model is elaborated, and a method of applying it to image data processing and semantic-based image retrieval is presented.

To describe the meaningful regions of images, a segmentation step is necessary. After comparing the computation speed and segmentation results of several traditional image segmentation algorithms, the watershed algorithm is chosen. Since it suffers from over-segmentation, a modified algorithm is proposed.
Then, 18 features covering color, texture, and shape are extracted from the segmented regions and are clustered and condensed using vector quantization. The clustered region features correspond to "code words" in a "code book". The condensed "code book" can be regarded as a semantic lexicon for the image database.

Finally, the method is implemented in MATLAB and applied to a Corel database of 400 images. The results show that the method can perform automatic image annotation and keyword-based retrieval.
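
The vector quantization step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's MATLAB implementation: it assumes a plain k-means-style codebook, uses toy 2-dimensional vectors in place of the 18 region features, and seeds the code book from evenly spaced input vectors purely to keep the example deterministic. All function and variable names (build_code_book, quantize, region_features) are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_code_word(vector, code_book):
    """Index of the code word closest to the given feature vector."""
    return min(range(len(code_book)),
               key=lambda k: squared_distance(vector, code_book[k]))


def build_code_book(vectors, size, iterations=20):
    """Cluster feature vectors into `size` code words (k-means style).

    Assumption: initial code words are taken at evenly spaced positions
    in the input, which is a simplification chosen for reproducibility.
    """
    code_book = [list(vectors[i * len(vectors) // size]) for i in range(size)]
    for _ in range(iterations):
        # Assign every vector to its nearest code word.
        clusters = [[] for _ in range(size)]
        for v in vectors:
            clusters[nearest_code_word(v, code_book)].append(v)
        # Move each code word to the mean of its cluster.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous code word
                code_book[k] = [sum(col) / len(members) for col in zip(*members)]
    return code_book


def quantize(vectors, code_book):
    """Replace each feature vector by the index of its code word."""
    return [nearest_code_word(v, code_book) for v in vectors]


# Toy stand-in for region features: two visually distinct groups of regions.
region_features = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0],
]
code_book = build_code_book(region_features, size=2)
labels = quantize(region_features, code_book)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

After quantization, each image can be represented by the code-word indices of its regions, a discrete representation analogous to the words of a text document, which is the kind of data a topic model such as LDA takes as input.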

Keywords/Search Tags:

Image Annotation, Semantic Gap, Latent Dirichlet Allocation, Vector Quantization

Related items

1	Research Of Image Annotation And Tag Recommendation On Shared Resource Websites
2	Tensor Representation And Semantic Modeling For Image Annotation
3	Analysis Model Of Medical Text And Image Based On LDA And LSA And Its Application
4	Aurora Image Classification Based On Multi-Feature Latent Dirichlet Allocation
5	Research On Text Retrieval Based On Topic Analysis
6	Research On Rough Classification Of Academic Papers Based On Topic And Semantic Fingerprint Fusion
7	Design And Implementation Of A Text Recommender System Of Social Network Based On Latent Dirichlet Allocation
8	Researches On Automatic Image Annotation
9	Research On Image Annotation Based On Scene Semantic
10	Semantic-based Image Multiclass Annotation