Research On Algorithm Of Content Based Near-Duplicate Image Retrieval

Posted on: 2011-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: B Xia
Full Text: PDF
GTID: 2178360305988638
Subject: Computer application technology
Abstract/Summary:
With the development of multimedia technology and the spread of digital devices, large numbers of images are now stored and transmitted in digital form. The rapid growth of the Internet has also made copying and distributing images more convenient. Retrieving the images that users need has therefore become a serious problem, and image retrieval technology has received wide attention.

In its early stage, text-based image retrieval played an important role, but it cannot easily retrieve images by their content, because manual annotation is too limited to describe the rich content of an image. In content-based image retrieval, images are represented by high-dimensional vectors, which makes similarity measurement difficult, and the semantic gap is a further problem. Because near-duplicate search is a key issue in content-based image retrieval, retrieving near-duplicate images quickly and accurately is a challenging task.

This dissertation focuses on near-duplicate image search. Specifically, it covers similar-image retrieval algorithms and methods for searching for near-duplicate images in large-scale databases. The main content and contributions are listed below:

First, the key technologies of content-based similar image retrieval are introduced. The emphasis is on measuring the similarity of near-duplicate images; image feature extraction and indexing techniques for high-dimensional vectors are also discussed.

Second, a global color histogram does not capture the spatial distribution of colors, so different images that share the same color histogram can cause false retrievals. To avoid this problem, a sub-block weight setting method is proposed. In this method, an image is first cut into sub-blocks, and retrieval is then performed by weighting the color histograms of the sub-blocks.
Experimental results show that retrieval performance is effectively improved.

Third, near-duplicate image retrieval suffers from high computational complexity, slow speed, and difficulty in scaling to large image sets. To address these problems, this dissertation presents a near-duplicate image retrieval method based on MD5. The method uses the average gray level of image blocks as the feature. After Laplacian Eigenmap dimensionality reduction and vector quantization, an MD5 digest is generated from the feature, and near-duplicate or duplicate images are then retrieved by matching this digest. Experimental results show that the algorithm is effective.
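
The sub-block idea in the second contribution can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the grid size, the number of color bins, the use of histogram intersection as the block similarity, and the uniform weights are all assumptions made for illustration, and every function name is hypothetical. The abstract does not say how the sub-block weights are actually set.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel in 0..255
Image = List[List[Pixel]]      # image as rows of pixels


def block_histogram(image: Image, top: int, left: int,
                    height: int, width: int, bins: int = 4) -> List[float]:
    """Normalized color histogram of one sub-block (bins**3 color cells)."""
    hist = [0.0] * (bins ** 3)
    for y in range(top, top + height):
        for x in range(left, left + width):
            r, g, b = image[y][x]
            # Map each channel to a bin index in 0..bins-1.
            idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
            hist[idx] += 1.0
    total = float(height * width)
    return [count / total for count in hist]


def sub_block_histograms(image: Image, grid: Tuple[int, int] = (2, 2),
                         bins: int = 4) -> List[List[float]]:
    """Cut the image into grid[0] x grid[1] blocks and histogram each one."""
    rows, cols = grid
    block_h = len(image) // rows
    block_w = len(image[0]) // cols
    return [block_histogram(image, r * block_h, c * block_w, block_h, block_w, bins)
            for r in range(rows) for c in range(cols)]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1] for two normalized histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def weighted_similarity(image_a: Image, image_b: Image, weights: Sequence[float],
                        grid: Tuple[int, int] = (2, 2), bins: int = 4) -> float:
    """Weighted average of per-block histogram similarities."""
    blocks_a = sub_block_histograms(image_a, grid, bins)
    blocks_b = sub_block_histograms(image_b, grid, bins)
    score = sum(w * histogram_intersection(ha, hb)
                for w, ha, hb in zip(weights, blocks_a, blocks_b))
    return score / sum(weights)


# Two 4x4 images with the same colors in mirrored positions.
RED, BLUE = (255, 0, 0), (0, 0, 255)
left_red = [[RED, RED, BLUE, BLUE] for _ in range(4)]
left_blue = [[BLUE, BLUE, RED, RED] for _ in range(4)]

# A single global histogram cannot tell them apart ...
print(weighted_similarity(left_red, left_blue, [1.0], grid=(1, 1)))   # 1.0
# ... while per-block histograms can (uniform weights, 2x2 grid).
print(weighted_similarity(left_red, left_blue, [1.0] * 4, grid=(2, 2)))  # 0.0
```

The mirrored pair shows the false match that the abstract attributes to global histograms: both images contain the same amounts of red and blue, so only the per-block comparison separates them. Non-uniform weights, for example a larger weight for central blocks, would be passed in the same `weights` argument.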
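
The MD5-based scheme in the third contribution can be sketched along the same lines. This is a simplified illustration under stated assumptions, not the dissertation's method: the Laplacian Eigenmap dimensionality reduction step is omitted entirely, and a uniform scalar quantizer stands in for the vector quantization step. The grid size, the number of quantization levels, and all function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List

GrayImage = List[List[int]]   # rows of gray levels in 0..255


def block_means(image: GrayImage, grid: int = 4) -> List[float]:
    """Average gray level of each block in a grid x grid partition."""
    block_h = len(image) // grid
    block_w = len(image[0]) // grid
    means = []
    for r in range(grid):
        for c in range(grid):
            total = 0
            for y in range(r * block_h, (r + 1) * block_h):
                for x in range(c * block_w, (c + 1) * block_w):
                    total += image[y][x]
            means.append(total / (block_h * block_w))
    return means


def quantize(features: List[float], levels: int = 8) -> bytes:
    """Uniform scalar quantization (assumed stand-in for vector quantization)."""
    return bytes(min(int(v * levels / 256), levels - 1) for v in features)


def md5_signature(image: GrayImage, grid: int = 4, levels: int = 8) -> str:
    """MD5 digest of the quantized block-mean feature."""
    return hashlib.md5(quantize(block_means(image, grid), levels)).hexdigest()


def build_index(images: Dict[str, GrayImage]) -> Dict[str, List[str]]:
    """Map each MD5 signature to the names of the images that produce it."""
    index: Dict[str, List[str]] = defaultdict(list)
    for name, image in images.items():
        index[md5_signature(image)].append(name)
    return index


# An 8x8 test pattern, a slightly brightened copy, and an inverted copy.
original = [[48 + 32 * ((y // 2 + x // 2) % 4) for x in range(8)] for y in range(8)]
brighter = [[v + 3 for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

index = build_index({"original": original, "brighter": brighter, "inverted": inverted})
# Lookup is a single dictionary access on the query's signature.
print(index[md5_signature(original)])   # ['original', 'brighter']
```

Because an MD5 digest only supports exact matching, all tolerance to small changes has to come from the steps before hashing: two images are reported as near-duplicates only if their quantized features are identical. In this sketch a feature value that sits near a quantization boundary can flip to a different level and produce an unrelated digest.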
Keywords/Search Tags:image content, sub-block, near-duplicate image