The image hash, also called an image digest or image authentication code, is an emerging technology in digital media security and multimedia applications. It can be applied to image authentication, tamper detection, image copy detection, image indexing, content-based image retrieval (CBIR), digital watermarking, and so on. An image hash is a compact, content-based representation that uses a short binary string to denote the image. In general, an image hash should satisfy three requirements: perceptual robustness, uniqueness, and security. Perceptual robustness means that visually identical images should have almost the same hash with high probability, even if their digital representations are not identical. Uniqueness, also called anti-collision capability, means that the probability of two different images having identical or very close hash values should tend to zero. Security means that the hash should be sensitive to malicious tampering and cannot be predicted without knowledge of the keys.

This dissertation focuses on the framework, methods, and performance evaluation of perceptual image hashing. After investigating existing similarity metrics and hashing methods, I propose a perceptual similarity metric, two image hashing methods, and a novel hashing framework with an implementation based on the discrete cosine transform (DCT) and non-negative matrix factorization (NMF). The contributions of this dissertation are as follows:

1. I develop a perceptual similarity metric for application to robust image hashing

To measure the perceptual similarity between an original image and its modified version, I propose an objective metric. Built from block structures, the metric not only indicates the distortion introduced by normal processing but also identifies local tampering. It outperforms PSNR and the mean SSIM index in sensitivity to malicious tampering and in resistance to rotation.

2.
I propose two image hashing methods for tamper detection

Tamper detection is an important and challenging task in image hashing. To this end, I design two methods, based respectively on the human visual system and on a data reduction technique.

Structural feature-based image hashing: since structural features can represent the visual appearance of an image, I exploit the structural features of blocks to construct robust image hashes. As an integral part of the hashing algorithm, I define a new similarity metric that fully exploits both the perceptual robustness and the sensitivity to tampering intrinsic to the obtained image hash.

NMF-based image hashing: NMF is an effective data reduction technique. I find that most pairs of adjacent entries in the NMF coefficient matrix are essentially invariant to ordinary image processing but change when tampering occurs. Based on this observation, a coarse quantization scheme is devised to compress the extracted features contained in the coefficient matrix. In hash generation, a secondary image is first constructed to achieve an initial data reduction by using fewer vectors to represent the original image. NMF is then applied to the secondary image, and the quantization rule is used to binarize the coefficient matrix and form the final hash.

Theoretical analysis and experimental results show that both methods are robust against perceptually acceptable modifications to the image, such as JPEG compression, moderate noise contamination, watermark embedding, Gaussian filtering, brightness and contrast adjustment, gamma correction, and scaling. They outperform Fridrich's method, the RASH method, and the NMF-NMF-SQ scheme in both anti-collision capability and tamper detection, indicating the usefulness of the techniques in digital forensics.

3.
I design a lexicographical framework for image hashing, and give an implementation based on DCT and NMF

A lexicographically structured framework for generating image hashes is proposed. The system consists of two parts: dictionary construction and maintenance, and hash generation. The dictionary is a large collection of feature vectors called words, which represent the characteristics of various image blocks. It is composed of a number of sub-dictionaries, each containing many features, and the number of features grows as the number of training images increases. The dictionary provides the basic building blocks, namely the words, from which the hash is formed. In hash generation, blocks of the input image are represented by features associated with the sub-dictionaries. This is achieved by using a similarity metric to find the most similar feature among the selected features of each sub-dictionary. The corresponding features are combined to produce an intermediate hash, and the final hash is obtained by encoding the intermediate hash.

Under the above framework, I implement a hashing scheme using DCT and NMF. Experimental results show that the proposed scheme is resistant to normal content-preserving manipulations and has a very low collision probability. Since the dictionary is constructed from a very large quantity of source images, it is virtually impossible to duplicate, which makes the image hashes secure. As expected, a larger dictionary and the use of more words from the sub-dictionaries for feature matching lead to better performance. However, the performance gain from using more words in the sub-dictionaries is less than linear, while the computational burden grows linearly.
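
The hash-generation step of the lexicographical framework can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes that block features have already been extracted (for example by DCT and NMF), that every block is matched against every sub-dictionary, that similarity is measured by squared Euclidean distance, and that the intermediate hash is encoded as fixed-width binary indices. The names `nearest_word`, `intermediate_hash`, and `encode` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]
SubDictionary = Sequence[Vector]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance, used here as an assumed similarity metric."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(feature: Vector, sub_dictionary: SubDictionary) -> int:
    """Index of the word in one sub-dictionary that is closest to the feature."""
    return min(range(len(sub_dictionary)),
               key=lambda j: squared_distance(feature, sub_dictionary[j]))


def intermediate_hash(block_features: Sequence[Vector],
                      sub_dictionaries: Sequence[SubDictionary]) -> List[int]:
    """For each block, record the index of its best match in every sub-dictionary."""
    return [nearest_word(feature, sub_dictionary)
            for feature in block_features
            for sub_dictionary in sub_dictionaries]


def encode(indices: Sequence[int], bits_per_index: int) -> str:
    """Assumed encoding: concatenate each index as a fixed-width binary string."""
    return "".join(format(i, "0{}b".format(bits_per_index)) for i in indices)


# Toy data (hypothetical): two sub-dictionaries of 2-D words, two block features.
sub_dictionaries = [
    [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]],
    [[5.0, 5.0], [-5.0, -5.0]],
]
block_features = [[9.0, 1.0], [0.5, 8.0]]
# Slightly perturbed features, standing in for a content-preserving manipulation.
perturbed_features = [[9.3, 0.8], [0.2, 8.4]]

original_hash = encode(intermediate_hash(block_features, sub_dictionaries), 2)
perturbed_hash = encode(intermediate_hash(perturbed_features, sub_dictionaries), 2)
print(original_hash, perturbed_hash)
```

In this toy setting a small perturbation of the features maps to the same words, so the hash is unchanged, while a party without the dictionary cannot reproduce the word indices. Matching against more words per sub-dictionary costs one extra distance computation per word, which is consistent with the linear growth in computation noted above.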