Research On Constructing A Lexica Family Of Concepts With Small Semantic Gap For Image Retrieval

Posted on: 2011-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: J M Liu
Full Text: PDF
GTID: 2178360308952510
Subject: Communication and Information System
Abstract/Summary:
Image retrieval has recently become one of the most important research topics in Multimedia Information Retrieval (MIR). A fundamental challenge in image retrieval is the semantic gap: the lack of coincidence between the information that one can extract from the visual data and the interpretation that the same data have for a user in a given situation. To reduce the semantic gap, a promising paradigm, concept-based image retrieval, focuses on modeling high-level semantic concepts, either by object recognition or by image annotation. In these approaches, the first step is to select a good lexicon that is relatively easy for computers to understand, and then to collect training data to learn the concepts.

Semantic gaps are not uniform across semantic concepts, and it is inappropriate to ignore this difference. Choosing concepts with a small semantic gap is worthwhile because they help train better high-level semantic concept models. Two main problems arise: how to define a "small semantic gap", that is, how to measure the semantic gap, and how to find concepts with a small semantic gap automatically. This paper quantitatively analyzes the semantic gap and, to address these two problems, proposes a novel framework for constructing a lexica family with small semantic gap based on different low-level visual features and different semantic gap models.

The whole procedure for lexica construction is as follows. Firstly, for a large-scale database of 2.4 million web images, we extract textual features from the rich surrounding text of each image. Several types of low-level visual features are also extracted and indexed. Secondly, based on different semantic gap models, we calculate each image's content-context confidence score, which measures the consistency between the distributions of images in the visual feature space and the textual feature space.
Thirdly, we cluster the images with the highest content-context confidence scores using the affinity propagation clustering algorithm. Finally, we extract keywords from the textual information of the clusters by measuring each keyword's degree of relatedness. The keywords with the highest relatedness scores constitute the final lexica family of high-level concepts with small semantic gap.

Semantic gaps differ across visual feature spaces. Based on color, co-occurrence texture, and wavelet texture, visual-feature-based lexica with small semantic gap are constructed. These lexica guide feature selection for concept-based image retrieval. Based on two different semantic gap models, the loose-textual semantic gap and the loose-visual semantic gap, semantic-gap-model-based lexica with small semantic gap are constructed. These lexica suggest which search model to choose for each concept and help improve the performance of image annotation.

In this paper, we choose the affinity propagation clustering algorithm to cluster a large-scale image database, for four reasons: 1) It does not require the number of clusters to be set in advance. 2) Its input is a similarity matrix, which suits this setting better than high-dimensional data points because both visual similarity and textual similarity must be considered. 3) It accommodates asymmetric similarities between two images. 4) It is effective and efficient when processing a large-scale database.

The experimental results demonstrate the validity of the developed lexica family. The lexica are independent of each other and mutually complementary. They provide helpful suggestions about data collection, semantic concept model construction, low-level feature selection, search model construction, and image annotation for large-scale image retrieval.
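
The choice of affinity propagation described in the abstract can be illustrated with a minimal pure-Python sketch of the algorithm in its standard message-passing form. This is not the thesis's implementation. The function name `affinity_propagation`, the damping factor, the iteration count, the toy one-dimensional points, and the preference value of -10 are all assumptions made for illustration. The sketch takes only a similarity matrix as input and never assumes that `S[i][k]` equals `S[k][i]`, and the number of clusters is not passed in.

```python
def affinity_propagation(S, damping=0.5, iterations=200):
    """Cluster n >= 2 items from an n x n similarity matrix S.

    S[i][k] says how well item k would serve as exemplar for item i;
    the diagonal S[k][k] holds the "preference" of k to be an exemplar.
    Returns (exemplars, labels), where labels[i] is the exemplar of item i.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=scores.__getitem__)
            second = max(scores[k] for k in range(n) if k != best)
            for k in range(n):
                rival = second if k == best else scores[best]
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # a(i, k) = min(0, r(k, k) + sum of positive r(i', k), i' not in {i, k})
        # a(k, k) = sum of positive r(i', k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]  # the self-responsibility enters unclipped
            total = sum(pos)
            for i in range(n):
                if i == k:
                    new = total - R[k][k]
                else:
                    new = min(0.0, total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # An item is an exemplar when its combined self-evidence is positive.
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Toy data (hypothetical): two well-separated groups on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
S = [[-(a - b) ** 2 for b in points] for a in points]
for k in range(len(points)):
    S[k][k] = -10.0  # assumed preference; lower values yield fewer clusters
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

In this toy setup the similarity is the negative squared distance, but any pairwise score could fill the matrix, for instance a combination of visual and textual similarity as the abstract suggests. The diagonal preference is the only knob that influences how many clusters emerge.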
Keywords/Search Tags: image retrieval, semantic gap, lexicon with small semantic gap, affinity propagation clustering