Study Of Audio Authentication Techniques Based On Perceptual Hashing And Digital Watermarking

Posted on: 2016-04-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Li
GTID: 1108330485483275
Subject: Information security
Abstract/Summary:
In recent years, new achievements and technologies in information science, network fusion, and other research fields have emerged continuously and have completely changed the traditional mode of information transmission. Multimedia has become one of the most popular forms of data interaction on platforms such as the internet, broadcast television, and mobile phones. At the same time, illegal copying, dissemination, fusion, and tampering of multimedia data have also been on the rise. These unauthorized behaviors seriously infringe the owner's copyright, damage the credibility of multimedia data, and hinder national intellectual property protection. Audio is one of the earliest digitized and most commonly used multimedia forms, so protecting the copyright, authenticity, and integrity of audio data is of great significance in multimedia information security. Perceptual hashing and digital watermarking are two key techniques for audio authentication. Based on these two techniques, algorithms were proposed to protect audio data. The main innovative results are listed below.
1. Because few perceptual hashing algorithms exist for MP3 compressed audio and existing ones have weak collision resistance, an audio perceptual hashing scheme based on non-negative matrix factorization (NMF) and modified discrete cosine transform (MDCT) coefficients was proposed. First, the MDCT coefficients are obtained during the decoding process and divided into overlapping segments. The energy of each segment is then calculated, and NMF is performed to generate the perceptual hashing sequences.
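
The segment-energy and NMF steps described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not say how the energy matrix is arranged or how the NMF output is binarized, so the row-wise matrix layout, the rank, the segment sizes, and the "compare neighbouring activations" rule are all assumptions, and every function name is hypothetical. The input is assumed to be a flat list of MDCT coefficients already taken from an MP3 decoder.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v into w * h with multiplicative updates."""
    rng = random.Random(seed)  # fixed seed so the hash is reproducible
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def segment_energies(coeffs, seg_len=32, hop=16):
    """Energy of each overlapping segment (hop < seg_len gives the overlap)."""
    return [sum(c * c for c in coeffs[s:s + seg_len])
            for s in range(0, len(coeffs) - seg_len + 1, hop)]

def perceptual_hash(coeffs, rows=8, rank=1, seg_len=32, hop=16):
    """Assumed pipeline: energies -> matrix -> NMF -> one bit per neighbour pair."""
    energies = segment_energies(coeffs, seg_len, hop)
    cols = len(energies) // rows
    # Assumption: fill the matrix row by row and drop leftover energies.
    v = [energies[r * cols:(r + 1) * cols] for r in range(rows)]
    _, h = nmf(v, rank)
    feats = [x for row in h for x in row]
    # Assumption: bit is 1 when the activation rises from one column to the next.
    return [1 if b > a else 0 for a, b in zip(feats, feats[1:])]

if __name__ == "__main__":
    rng = random.Random(1)
    demo = [rng.uniform(-1.0, 1.0) for _ in range(2048)]  # stand-in for MDCT data
    print(perceptual_hash(demo))
```

Fixing the random seed inside `nmf` matters here: NMF is not unique, so two runs with different initializations could produce different hashes for the same audio.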
The experimental results show that, compared with Deng’s algorithm based on spectral energy and Chen’s algorithm based on wavelet decomposition, the proposed scheme yields better FAR-FRR curves, and the difference between the maximum intra-distance and the minimum inter-distance increases by 2.5%, which means that both discrimination and collision resistance are improved.
2. To address the sensitivity to added noise and the low computational efficiency of existing audio perceptual hashing algorithms, an audio perceptual hashing algorithm based on the Radon transform was proposed. Multiresolution wavelet decomposition is applied to the audio signal, and the approximation components are mapped into a matrix that serves as the feature of the audio data. The Radon transform, which reduces the dimension of the matrix and is insensitive to added noise, is used to derive efficient acoustic features. The DCT is then applied to several random Radon projections to yield a low-dimensional feature vector and generate the hashing sequences. Experiments were carried out on two databases: a speech database and a music database. The results demonstrate that, compared with Chen’s existing algorithm, the entropy rate of the proposed algorithm is increased by 0.22, the bit error rate under additive noise attacks is reduced by at least 0.5%, and the computation speed is increased by 9.25 times; the proposed algorithm is also more robust against resampling, re-quantization, low-pass filtering, and similar operations.
3. To improve the detection of speech content tampering, a speech content authentication method was proposed based on the correlation coefficients of Mel-frequency cepstral coefficients (MFCCs). MFCCs are extracted from the segmented speech as perceptual features, and the perceptual hashing sequence is generated by quantizing the correlation values of the MFCCs.
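
As a rough illustration of hashing from MFCC correlations, the sketch below takes MFCC vectors that are assumed to have been computed elsewhere (one vector per speech segment), measures the Pearson correlation between neighbouring segments, and turns each value into one bit. The abstract does not specify which correlations are used or how they are quantized, so the neighbour pairing and the fixed threshold are assumptions, and the function names are hypothetical. A bit error rate helper is included because the abstract reports results in that metric.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant vector carries no correlation information
    return cov / (sx * sy)

def correlation_hash(mfcc_frames, threshold=0.0):
    """One bit per pair of neighbouring segments (assumed quantization rule)."""
    corrs = [pearson(a, b) for a, b in zip(mfcc_frames, mfcc_frames[1:])]
    return [1 if c > threshold else 0 for c in corrs]

def bit_error_rate(hash_a, hash_b):
    """Fraction of positions where two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b)) / len(hash_a)

if __name__ == "__main__":
    # Toy vectors standing in for real MFCCs of four speech segments.
    frames = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0], [6.0, 4.0, 2.0]]
    reference = correlation_hash(frames)
    tampered = correlation_hash([frames[0], frames[2], frames[1], frames[3]])
    print(reference, tampered, bit_error_rate(reference, tampered))
```

In an authentication setting, a small bit error rate between the stored and recomputed hashes would suggest a content-preserving operation, while a large one would suggest tampering; where that boundary lies is a threshold that has to be chosen experimentally.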
A similarity metric is used to detect tampering of the speech signal and to improve detection accuracy during authentication. Simulation results show that the entropy of the proposed algorithm is 0.26 higher than that of the comparison algorithm; the false accept rate of the proposed algorithm is significantly lower than that of the comparison algorithm under the same threshold; and the bit error rate is lower under content-preserving manipulations such as re-quantization and MP3 compression. In addition, the similarity metric makes the method very sensitive to speech tampering.
4. Because a binary image used as a watermark is not unique and is easily lost or tampered with, an audio source authentication scheme was proposed that combines fingerprint centroid perceptual hashing in the discrete cosine domain with digital watermarking. As a biometric feature, a fingerprint image is unique for identity recognition. In this work, a fingerprint perceptual hashing sequence is generated as the watermark and then embedded into the audio data, associating the fingerprint with the audio for copyright protection and identification. Random blocks are taken from the fingerprint image, and centroids are calculated from the discrete cosine transform (DCT) coefficients of these blocks. The fingerprint perceptual hash is generated by quantizing these centroids. The hash bits are then embedded as watermarks into the audio in the hybrid discrete wavelet transform (DWT) and DCT domain. The extracted watermarks are matched against perceptual hash values in the fingerprint perceptual hashing database to authenticate the source identity of the audio. Experimental results demonstrate that the proposed fingerprint perceptual hashing algorithm has good discrimination, exhibits strong robustness against noise addition, and can resist rotation attacks within 20 degrees. In addition, the watermarking scheme is robust against noise addition, low-pass filtering, resampling, and similar attacks.
5. To address the low imperceptibility of existing ratio-based audio watermarking algorithms, a blind audio watermarking scheme was proposed that uses the norm ratio of the approximation coefficients in the DWT domain. The approximation components are divided into two parts, and the stable ratio of the p-norms of the two parts is calculated and quantized to embed the watermark. An optimized method for selecting the scaling factors used to modify the coefficients was also proposed. Experimental results indicate that the proposed watermarking scheme is robust against common attacks such as resampling and MP3 compression; meanwhile, imperceptibility is improved, with the SNR 3 dB higher than that of Huang’s ratio-based algorithm in the wavelet domain.
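
The norm-ratio idea in item 5 can be sketched as follows, under stated assumptions. The input is a list of DWT approximation coefficients for one frame, assumed to be computed elsewhere. The parity-based quantization of the ratio and the symmetric scaling of the two halves are illustrative choices only; they are not the optimized scaling-factor selection the dissertation proposes, and the names `embed_bit`, `extract_bit`, and the step size `delta` are hypothetical.

```python
import math

def p_norm(values, p=2.0):
    return sum(abs(x) ** p for x in values) ** (1.0 / p)

def norm_ratio(coeffs, p=2.0):
    """Ratio of the p-norms of the first and second halves of the frame."""
    mid = len(coeffs) // 2
    num, den = p_norm(coeffs[:mid], p), p_norm(coeffs[mid:], p)
    if num == 0.0 or den == 0.0:
        raise ValueError("both halves need non-zero energy")
    return num / den

def embed_bit(coeffs, bit, delta=0.1, p=2.0):
    """Move the ratio to the nearest multiple of delta whose index parity is bit."""
    ratio = norm_ratio(coeffs, p)
    k = math.floor(ratio / delta)
    if k % 2 != bit:
        k += 1
    if k == 0:
        k = 2  # keep the target ratio strictly positive
    alpha = math.sqrt(k * delta / ratio)
    mid = len(coeffs) // 2
    # Scaling one half by alpha and the other by 1/alpha changes the ratio
    # by alpha**2 while spreading the distortion over both halves.
    return [x * alpha for x in coeffs[:mid]] + [x / alpha for x in coeffs[mid:]]

def extract_bit(coeffs, delta=0.1, p=2.0):
    """Blind extraction: only the received coefficients and delta are needed."""
    return round(norm_ratio(coeffs, p) / delta) % 2

def snr_db(original, marked):
    """Signal-to-noise ratio of the marked frame relative to the original."""
    noise = sum((x - y) ** 2 for x, y in zip(original, marked))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(sum(x * x for x in original) / noise)

if __name__ == "__main__":
    frame = [0.5, -1.2, 0.8, 0.3, -0.7, 1.1, 0.4, -0.9]
    for bit in (0, 1):
        marked = embed_bit(frame, bit)
        print(bit, extract_bit(marked), snr_db(frame, marked))
```

Because the hidden value is a ratio of norms, multiplying the whole frame by a constant gain leaves the extracted bit unchanged, which is one reason ratio-based schemes tolerate amplitude changes. A smaller `delta` lowers distortion (higher SNR) at the cost of robustness.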
Keywords/Search Tags: Audio Authentication, Perceptual Hashing, Digital Watermarking, Discrimination, Robustness