
A Similar Image Search Engine Based On Millions Of Images And Distributed Computing

Posted on: 2011-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhou
GTID: 2178360302974627
Subject: Computer application technology
Abstract/Summary:
This dissertation combines the theory of content based image retrieval with distributed computing technology, and discusses how to build a similar image search engine based on millions of real world images.

First, we focus on solving the similar image search problem in content based image retrieval. We define two kinds of similar images: globally similar images, which have similar structures or originate from the same content but differ in size or quality, and locally similar images, which show the same scene, object or people. We use a global image feature, the Haar wavelet decomposition feature, to handle globally similar images, and a local image feature, the SIFT feature, to handle locally similar images.

Second, building an image search engine with millions of images means storing and matching over one billion image features. To match such a huge number of high dimensional feature vectors, we studied and compared different algorithms, and chose Locality Sensitive Hashing (LSH) as our method for feature vector indexing and matching. We discuss different parameter settings and implementation strategies for fitting LSH to our application.

Third, to cope with the enormous storage and computing requirements of so many images and features, we went beyond the limitations of traditional research and solved the problem with distributed storage and computing. We implemented a lightweight distributed file system on the Windows platform, and parallelized image feature extraction, indexing and matching using distributed computing. Experiments showed that the system was able to run on millions of images and handle user queries effectively and efficiently.
Keywords/Search Tags: Content Based Image Retrieval, Search Engine, Distributed Computing, Distributed File System, Image Wavelet Decomposition, Scale Invariant Image Feature, Locality Sensitive Hashing
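
The abstract names Locality Sensitive Hashing as the indexing and matching method but does not say which hash family or parameters the thesis uses. As an illustration only, the following minimal sketch shows the general idea with random-hyperplane hashing, which is an assumption here and not necessarily the thesis's scheme. The names `LSHIndex`, `make_hyperplanes`, `hash_vector`, `num_tables` and `num_bits` are all hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_vector(vec, hyperplanes):
    """Pack the sign of each projection into one integer bucket key."""
    bits = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        bits = (bits << 1) | (1 if dot >= 0 else 0)
    return bits


class LSHIndex:
    """Several independent hash tables; nearby vectors tend to share buckets."""

    def __init__(self, dim, num_tables=4, num_bits=8, seed=0):
        self.tables = []
        for t in range(num_tables):
            planes = make_hyperplanes(num_bits, dim, seed + t)
            self.tables.append((planes, defaultdict(list)))
        self.vectors = {}

    def add(self, key, vec):
        """Store vec and file its key under one bucket per table."""
        self.vectors[key] = vec
        for planes, buckets in self.tables:
            buckets[hash_vector(vec, planes)].append(key)

    def candidates(self, vec):
        """Union of the keys that share a bucket with vec in any table."""
        found = set()
        for planes, buckets in self.tables:
            found.update(buckets.get(hash_vector(vec, planes), []))
        return found

    def query(self, vec, top_k=1):
        """Rank only the candidates by exact squared Euclidean distance."""
        def dist(key):
            return sum((a - b) ** 2 for a, b in zip(self.vectors[key], vec))
        return sorted(self.candidates(vec), key=dist)[:top_k]
```

In a sketch like this, raising `num_bits` makes buckets smaller and matching faster but misses more true neighbors, while raising `num_tables` recovers recall at the cost of memory. This is the kind of parameter trade-off the abstract refers to.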