
Research on Content-Based Image Copy Detection

Posted on: 2016-03-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Luo
Full Text: PDF
GTID: 2308330467982273
Subject: Computer technology
Abstract/Summary:
With the development of the Internet and the spread of image editing software, more and more copies of images appear online and spread quickly, which leads to a series of problems such as infringement, counterfeiting, and redundant database storage. Near-duplicate image detection is a key topic in image research: its goal is to find, in an image library, the images that are highly similar to a query image. Near-duplicate images are produced by operations such as changing the image size or contrast, rotating, cropping, inserting text, and adding noise. Near-duplicate image detection can be applied widely, for example in image copyright protection, image forgery detection, video copy detection, and image query.

The difficulty of near-duplicate image detection lies in extracting and matching image features more efficiently. To address the low efficiency and accuracy of existing methods, a new detection algorithm based on MSER, SURF, and the spatial pyramid model is proposed. First, the MSER and SURF features of the images are extracted. Second, all the features are clustered by the k-means algorithm to form a visual dictionary. Finally, the spatial pyramid model is used to integrate spatial information into the images' feature information, which improves the recall and precision of near-duplicate image detection. The experimental results show that this algorithm is feasible for large-scale near-duplicate image detection.

The traditional bag-of-words (BoW) model uses the k-means algorithm to cluster the image features, which leads to synonymy and ambiguity in the visual vocabulary. The resulting redundant vocabulary is also poorly suited to dynamic expansion, so we use a Gaussian mixture model (GMM) based on the k-means algorithm to cluster the image features and generate a more reliable visual vocabulary. Then, to capture the spatial information of the targets in the image scene and the discriminative information of latent topics, we use probabilistic latent semantic analysis (PLSA) to integrate more image information into the BoW model and thereby increase the accuracy of copy detection. The experimental results show that this algorithm is feasible.
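
The first pipeline described above (cluster local descriptors into a visual dictionary, then build a spatial pyramid of visual-word histograms) can be illustrated with a minimal sketch. This is not the thesis implementation: it uses random 64-dimensional vectors as stand-ins for SURF descriptors, a plain NumPy k-means, an unweighted pyramid, and histogram intersection as the similarity measure. All function names (`kmeans`, `assign_words`, `spatial_pyramid_histogram`, `histogram_intersection`) and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def assign_words(X, centers):
    """Index of the nearest centroid (visual word) for each descriptor."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign_words(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def spatial_pyramid_histogram(xy, words, k, levels=2):
    """Concatenate per-cell visual-word histograms over a pyramid.

    xy:    (n, 2) keypoint coordinates normalised to [0, 1).
    words: (n,) visual-word index of each keypoint.
    Level l splits the image into 2**l x 2**l cells (unweighted here).
    """
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        hist = np.zeros((cells * cells, k))
        np.add.at(hist, (cy * cells + cx, words), 1.0)
        parts.append(hist.ravel())
    vec = np.concatenate(parts)
    return vec / max(vec.sum(), 1.0)  # L1-normalise

def histogram_intersection(a, b):
    """Similarity in [0, 1] for two L1-normalised histograms."""
    return float(np.minimum(a, b).sum())

# Synthetic stand-ins for real descriptors (assumption: 64-dim vectors).
rng = np.random.default_rng(0)
k = 8
vocab = kmeans(rng.normal(size=(500, 64)), k)

xy = rng.random((120, 2))
desc = rng.normal(size=(120, 64))
original = spatial_pyramid_histogram(xy, assign_words(desc, vocab), k)

# Simulated near-duplicate: same keypoints, slightly perturbed descriptors.
noisy = desc + 0.01 * rng.normal(size=desc.shape)
copy = spatial_pyramid_histogram(xy, assign_words(noisy, vocab), k)

score = histogram_intersection(original, copy)
```

In a real system the descriptors would come from MSER and SURF detectors, and a candidate would be reported as a near-duplicate when `score` exceeds a threshold tuned on labelled data. The GMM variant mentioned in the abstract would replace the hard nearest-centroid assignment with soft assignments from a mixture model initialised by k-means.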
Keywords/Search Tags: near-duplicate image detection, visual vocabulary, Bag of Words, Spatial Pyramid Model, Gaussian Mixture Models, Probabilistic Latent Semantic Analysis