
Research On Near-Duplicate Image Detection And Its Application

Posted on: 2013-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Lin
Full Text: PDF
GTID: 2248330395480579
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of multimedia and network technologies, people are immersed in massive amounts of digital image information. Compared with text, digital images are more vivid and easier to understand, which has led to their wide use in many fields. However, large image collections contain many near-duplicate images, which introduce redundancy. Near-duplicate image detection has emerged to address this problem, and matching pattern learning and BoVW (Bag of Visual Words) are its two mainstream approaches. This thesis studies near-duplicate image detection in depth and makes the following contributions:

1. To address the complex point-to-point matching problem in matching pattern learning, a near-duplicate image detection method based on random mapping and pattern entropy is proposed. First, E2LSH, which performs well in high-dimensional approximate nearest neighbor search, is used to filter out most of the non-near-duplicate image pairs. Then, the greatly reduced pool of image pairs is further examined with an enhanced scale-rotation invariant pattern entropy to remove wrong matches. Experimental results show that the method significantly speeds up detection without apparent degradation in performance.

2. To address the visual word synonymy and polysemy problem in BoVW, a near-duplicate image detection method based on LSI (Latent Semantic Indexing) and soft-weighting is proposed. First, LSI is used to reduce the dimensionality of a large-scale visual vocabulary, yielding a compact semantic visual vocabulary. Then, a soft-weighting scheme is used to construct a visual word distribution histogram. Finally, histogram intersection is adopted to measure the similarity of two histograms and thereby accomplish near-duplicate detection. Experimental results show that the method achieves higher detection accuracy while keeping time efficiency acceptable.

3. To study the application of near-duplicate image detection to image retrieval reranking, a cluster-based image reranking method is proposed. First, an image near-duplicate graph is constructed from the initial text-based retrieval results. Second, near-duplicate clusters are identified by mining the near-duplicate graph. Third, the near-duplicate clusters are ranked according to given rules. Finally, a representative image is selected from each ranked cluster to form the final reranking results. Experimental results show that the method not only moves irrelevant images to the bottom of the ranking list but also eliminates near-duplicates from the list, giving more diverse results.
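
The four reranking steps in item 3 can be illustrated with a minimal Python sketch. The abstract does not state how clusters are mined, which rules rank them, or how the representative is chosen, so every concrete choice here is an assumption made for illustration: clusters are taken to be connected components of the near-duplicate graph, larger clusters are ranked first (ties broken by best initial rank), and the representative is the member with the best initial rank. The function name `rerank_by_clusters` and its parameters are hypothetical.

```python
from collections import defaultdict


def rerank_by_clusters(ranked_ids, near_dup_pairs):
    """Rerank an initial result list using near-duplicate clusters.

    ranked_ids: image ids in initial (text-based) rank order, best first.
    near_dup_pairs: iterable of (id_a, id_b) pairs judged near-duplicate
        by some detector (not implemented here).
    """
    position = {img: i for i, img in enumerate(ranked_ids)}

    # Step 1: build the near-duplicate graph as an adjacency list,
    # ignoring pairs that mention images outside the result list.
    graph = defaultdict(set)
    for a, b in near_dup_pairs:
        if a in position and b in position:
            graph[a].add(b)
            graph[b].add(a)

    # Step 2: identify clusters. Assumption: a cluster is a connected
    # component of the graph, found with an iterative depth-first search.
    seen, clusters = set(), []
    for img in ranked_ids:
        if img in seen:
            continue
        seen.add(img)
        stack, component = [img], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(component)

    # Step 3: rank clusters. Assumed rule: larger clusters first,
    # ties broken by the best initial position inside the cluster.
    clusters.sort(key=lambda c: (-len(c), min(position[i] for i in c)))

    # Step 4: pick one representative per cluster. Assumed rule: the
    # member with the best initial rank.
    return [min(c, key=position.__getitem__) for c in clusters]


if __name__ == "__main__":
    initial = ["a", "b", "c", "d", "e"]
    pairs = [("b", "d"), ("d", "e")]
    print(rerank_by_clusters(initial, pairs))
```

With the sample input, images b, d and e form one cluster and are collapsed to a single representative, so the output list contains three images instead of five. Unclustered images end up after the larger cluster under the assumed size-first rule; a different ranking rule only changes the sort key in step 3.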
Keywords/Search Tags: near-duplicate detection, matching pattern learning, random mapping, pattern entropy, bag of visual words, latent semantic indexing, soft-weighting, image retrieval reranking