
Global And Local Features Fusion For Large-scale Image Retrieval

Posted on: 2018-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: X J Luo
Full Text: PDF
GTID: 2348330521950952
Subject: Computer system architecture
Abstract/Summary:
With the explosive growth of image data on the web, large-scale image retrieval has become a crucial topic in the computer vision community. In particular, content-based image retrieval (CBIR) has come to prominence because users want to submit an example image and obtain similar images in real time. An inverted index built on the Bag-of-Words (BoW) model with local features (e.g., SIFT descriptors) has been regarded as the state-of-the-art indexing method in CBIR systems. However, it suffers from the "semantic gap" between low-level image representations and high-level semantic concepts, which can lead to false matches. Since deep features learned by Convolutional Neural Networks (CNNs) were introduced, they have demonstrated impressive performance in image retrieval, but such global features may still fail in cases with only fine-grained differences. Therefore, integrating global and local features as complementary cues and constructing an efficient index framework to improve the precision of CBIR systems is a promising direction.

In this thesis we propose an inverted coupled-index framework for feature fusion. We further introduce an optimized weighting scheme to estimate the importance of a visual word to an image. The two main contributions are as follows.

First, we employ an inverted coupled index to perform feature fusion at the indexing level. We divide each image into several patches and treat each patch as a visual word, so that every patch is represented by a feature tuple combining a local representation (an aggregated vector of SIFT descriptors) and a global representation (a CNN feature). Inspired by the conventional BoW model, our method partitions the two feature spaces into two sets of clusters, so each visual word is represented by a pair of cluster centers, one from each codebook. The resulting coupled codebook corresponds to a finer partitioning of the feature space, yielding more discriminative visual words and fewer false matches.

Second, we take the term-frequency distribution and the topic correlation into account when estimating weights. According to our study, the conventional IDF formula ignores the frequency distribution of visual words, which can reduce their discriminative power or amplify noisy occurrences; the optimized scheme avoids these problems. Moreover, we use a topic model to learn the correlation between a visual word and a visual topic, and strengthen the distinctive power of topic-relevant words in the matching function. Corresponding to the multi-index, we design a multi-IDF scheme for weight estimation to improve the overall distinctiveness of visual words in the similarity computation.

By leveraging the inverted coupled index to integrate global and local features, we effectively improve the precision of the CBIR system. Furthermore, we apply Multiple Assignment for quantization and the Hamming Embedding algorithm for matching, which benefit efficiency and memory cost. We build the proposed model and implement a CBIR system on a real image database; it achieves competitive performance compared with other state-of-the-art methods, and we verify the superiority of our solution through both theoretical analysis and performance study.
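To make the indexing idea concrete, the following is a minimal Python sketch (not the thesis implementation) of an inverted coupled index: each patch is quantized against two codebooks, one built from aggregated SIFT vectors and one from CNN features, and the pair of cluster ids forms the coupled visual word used as the index key. Function and variable names are illustrative assumptions.

```python
# Minimal sketch of an inverted coupled index (hypothetical names and data layout).
from collections import defaultdict
import numpy as np

def quantize(vec, codebook):
    """Return the index of the nearest codebook centroid (L2 distance)."""
    dists = np.linalg.norm(codebook - vec, axis=1)
    return int(np.argmin(dists))

def build_coupled_index(patches, sift_codebook, cnn_codebook):
    """patches: iterable of (image_id, sift_vec, cnn_vec) tuples (assumed layout)."""
    index = defaultdict(list)  # (local_word, global_word) -> list of image ids
    for image_id, sift_vec, cnn_vec in patches:
        word = (quantize(sift_vec, sift_codebook),
                quantize(cnn_vec, cnn_codebook))
        index[word].append(image_id)
    return index

def lookup(index, sift_vec, cnn_vec, sift_codebook, cnn_codebook):
    """Return candidate images that share the same coupled visual word as the query patch."""
    word = (quantize(sift_vec, sift_codebook),
            quantize(cnn_vec, cnn_codebook))
    return index.get(word, [])
```

Because a candidate must agree with the query in both codebooks, the coupled key behaves like the finer partitioning described above and filters out many false matches that either feature alone would accept; the thesis further refines retrieval with Multiple Assignment, Hamming Embedding, and the multi-IDF weighting, which are not shown in this sketch.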
Keywords/Search Tags:inverted coupled-index, bag-of-words, global feature, local feature, weighting estimation