
Perceptual Hashing For Image Copy Detection

Posted on: 2015-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: F Yang
GTID: 2268330431958488
Subject: Computer application technology
Abstract/Summary:
Perceptual hashing is a new topic in the field of multimedia information security. It has been applied to multimedia retrieval and indexing, digital watermarking, multimedia authentication, copy detection, and so on. Essentially, it is a one-way mapping function that converts any multimedia data into a short, fixed-length sequence called a hash. For digital images, a perceptual hashing algorithm maps any image to a short sequence of bits, characters or numbers. If two images have the same visual content, the hashes extracted from them will be identical or very similar. This property is the robustness of the hashing algorithm. It ensures that image copies generated by digital operations, such as JPEG compression, filtering, geometric transformation and image enhancement, are identified correctly. Images with different visual content, in contrast, produce quite different hashes. This property is the uniqueness of the hashing algorithm, which makes it possible to distinguish different images effectively. In addition to these two properties, perceptual hashing algorithms must meet other requirements when they are applied in specific areas. For example, image authentication requires security: hash generation must be controlled by a key, and different keys must lead to different hashes.

This thesis takes digital images as its object and investigates perceptual hashing and its applications using the discrete cosine transform (DCT), the discrete wavelet transform (DWT) and principal component analysis (PCA). It obtains three results: image hashing based on DCT and DWT, image hashing based on dominant DCT coefficients, and image hashing based on PCA. The detailed contents are as follows.

1. I propose an image hashing algorithm based on DCT and DWT.

Considering that the L* component of the CIE L*a*b* color space is consistent with the perception of the human visual system, I extract perceptually robust image hashes from the L* component by combining DCT and DWT. Specifically, I first pre-process the input image to obtain its luminance component L* and divide it into non-overlapping blocks. The 2D DCT is applied to each block, and the low-frequency coefficients of each block are extracted to form a feature matrix. Finally, the feature matrix is decomposed with DWT, and the coefficients in the LL sub-band are used to construct the image hash. The proposed algorithm is robust to normal digital operations such as JPEG compression, image enhancement and small-angle rotation, and has good discriminative capability.

2. I propose a robust image hashing algorithm with dominant DCT coefficients.

After studying the two-dimensional DCT coefficients, I propose a robust image hashing algorithm that uses the dominant DCT coefficients in the first row and first column. The proposed algorithm first pre-processes the image and divides it into non-overlapping blocks. It then extracts the dominant DCT coefficients in the first row and first column of each block to construct two feature matrices, normalizes the feature matrices, and extracts the hash value by calculating and quantizing the distance between columns of the normalized matrices. Experimental results show that the proposed hashing is robust to normal digital operations, such as JPEG compression, brightness adjustment and contrast adjustment, and has desirable discrimination.

3. I propose an image hashing algorithm based on principal component analysis.

PCA is an effective dimensionality reduction method that finds a set of uncorrelated low-dimensional vectors to represent the original, highly correlated high-dimensional vectors. Considering that there is a strong correlation among local pixels in an image, I propose an image hashing algorithm that views each image block as a high-dimensional vector, reduces its dimension with PCA, and compresses the low-dimensional vectors into a small image hash. Specifically, the proposed algorithm first converts the input image into a fixed-size luminance component, then divides it into non-overlapping blocks, arranges the pixels of each block into a column, and assembles the columns into a secondary image for data reduction. After that, it applies PCA to the secondary image to obtain a set of low-dimensional vectors, and uses the distances between these vectors to generate the hash value. Experimental results indicate that the proposed PCA-based image hashing is robust to common content-preserving manipulations and achieves good discrimination between images with different content.

To analyze the classification performance of the above algorithms, that is, the trade-off between robustness and discrimination, receiver operating characteristic (ROC) curves are used, and comparisons are made with classical algorithms such as the GF-LVQ hashing and the RT-DCT hashing. Experimental results show that the three proposed algorithms have better classification performance than the compared methods. To verify the performance of the proposed algorithms in copy detection, a large database with 1200 different images is first constructed, and image copies generated by 10 different digital operations are then inserted into it. Copy detection is finally conducted on the resulting database of 1210 images. Detection results show that, at a low false detection ratio, the proposed algorithms find all image copies, indicating good detection performance.
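The pipeline of the second algorithm (block division, first-row and first-column DCT coefficients, normalization, quantized column distances) can be illustrated with a minimal sketch. This is not the thesis's implementation. The block size, the number of coefficients kept, the exclusion of the DC term, the row-wise z-score normalization, the use of adjacent columns, and the mean-based one-bit quantization are all assumptions made here for illustration, since the abstract does not specify them. The image is assumed to be a grayscale 2D list already pre-processed to a fixed size.

```python
import math
import random

BLOCK = 8    # block side length (assumed; not given in the abstract)
N_COEF = 4   # AC coefficients kept per first row / first column (assumed)


def dct_coef(block, u, v):
    """Orthonormal 2D DCT-II coefficient (u, v) of a square block."""
    n = len(block)
    au = math.sqrt((1 if u == 0 else 2) / n)
    av = math.sqrt((1 if v == 0 else 2) / n)
    total = 0.0
    for x in range(n):
        cx = math.cos((2 * x + 1) * u * math.pi / (2 * n))
        for y in range(n):
            total += block[x][y] * cx * math.cos((2 * y + 1) * v * math.pi / (2 * n))
    return au * av * total


def blocks(image):
    """Yield non-overlapping BLOCK x BLOCK blocks in raster order."""
    for r in range(0, len(image) - BLOCK + 1, BLOCK):
        for c in range(0, len(image[0]) - BLOCK + 1, BLOCK):
            yield [row[c:c + BLOCK] for row in image[r:r + BLOCK]]


def normalize_rows(matrix):
    """Zero-mean, unit-variance normalization of each row (assumed scheme)."""
    out = []
    for row in matrix:
        mean = sum(row) / len(row)
        std = math.sqrt(sum((x - mean) ** 2 for x in row) / len(row)) or 1.0
        out.append([(x - mean) / std for x in row])
    return out


def column_distance_bits(matrix):
    """One bit per pair of adjacent columns: 1 if their distance exceeds the mean."""
    cols = list(zip(*matrix))
    dists = [math.dist(a, b) for a, b in zip(cols, cols[1:])]
    mean = sum(dists) / len(dists)
    return [1 if d > mean else 0 for d in dists]


def image_hash(image):
    """Binary hash from first-row and first-column AC coefficients of each block."""
    first_row = [[] for _ in range(N_COEF)]  # rows: coefficient index, columns: blocks
    first_col = [[] for _ in range(N_COEF)]
    for block in blocks(image):
        for k in range(1, N_COEF + 1):       # k starts at 1, so the DC term is skipped
            first_row[k - 1].append(dct_coef(block, 0, k))
            first_col[k - 1].append(dct_coef(block, k, 0))
    return (column_distance_bits(normalize_rows(first_row))
            + column_distance_bits(normalize_rows(first_col)))


def hamming(h1, h2):
    """Number of positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(h1, h2))


# Toy usage on a synthetic 32x32 image (random pixels, not real data).
random.seed(0)
img = [[random.randrange(256) for _ in range(32)] for _ in range(32)]
brighter = [[p + 20 for p in row] for row in img]
print(hamming(image_hash(img), image_hash(brighter)))
```

Because a uniform brightness shift changes only the DC coefficient of each block, the AC-based features in this sketch are unaffected by it apart from floating-point rounding, which is one simple way to see why such features can be robust to brightness adjustment. For real images, a vectorized DCT from a numerical library would be the practical choice.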
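The ROC analysis and the copy-detection experiment both come down to thresholding a distance between hashes. The sketch below shows one way this could look, assuming binary hashes compared with the Hamming distance. The abstract does not state the distance metric or the thresholds, so these, the function names, and all the numbers in the toy data are hypothetical.

```python
def hamming(h1, h2):
    """Number of positions at which two equal-length hashes differ."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(h1, h2))


def roc_points(copy_dists, diff_dists, thresholds):
    """Return one (FPR, TPR) point per threshold.

    A pair is declared a copy when its distance is <= threshold.
    TPR: fraction of true copy pairs accepted (robustness).
    FPR: fraction of different-image pairs wrongly accepted (discrimination).
    """
    points = []
    for t in thresholds:
        tpr = sum(d <= t for d in copy_dists) / len(copy_dists)
        fpr = sum(d <= t for d in diff_dists) / len(diff_dists)
        points.append((fpr, tpr))
    return points


def detect_copies(query_hash, database, threshold):
    """Return the names of database entries within `threshold` of the query."""
    return [name for name, h in database.items()
            if hamming(query_hash, h) <= threshold]


# Made-up distances for illustration only (not results from the thesis).
copy_dists = [0, 1, 1, 2, 3]     # distances between images and their copies
diff_dists = [6, 7, 9, 10, 12]   # distances between different images
print(roc_points(copy_dists, diff_dists, thresholds=[2, 4, 8]))
# → [(0.0, 0.8), (0.0, 1.0), (0.4, 1.0)]

# Made-up 8-bit hashes for illustration only.
database = {
    "a": [0, 0, 1, 1, 0, 1, 0, 1],
    "b": [1, 1, 0, 0, 1, 0, 1, 0],
    "a_copy": [0, 0, 1, 1, 0, 1, 1, 1],
}
print(detect_copies([0, 0, 1, 1, 0, 1, 0, 1], database, threshold=2))
# → ['a', 'a_copy']
```

In the toy data, the threshold of 4 accepts every copy pair and no different-image pair, which corresponds to the kind of operating point the abstract describes: all copies found at a low false detection ratio.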
Keywords/Search Tags: perceptual hash, image hash, image copy detection, discrete wavelet transform, discrete cosine transform, principal component analysis, CIE L*a*b* color space