Research On Perceptual Audio Hashing

Posted on:2011-05-16

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y H Jiao

GTID:1118360332457970

Subject:Computer Science and Technology

Abstract/Summary:

Perceptual hashing, also referred to as robust hashing or fingerprinting, maps digital multimedia data into a compact digital digest. Unlike cryptographic hashing, perceptual hashing is tolerant of content-preserving operations and sensitive to perceptual changes in content. Different digital representations of multimedia are mapped to the same digest when they carry the same perceptual content, while multimedia with different content is mapped to distinct and statistically random hash values. As such, perceptual hashing is well suited to content authentication as well as content-based identification, indexing, and retrieval.

The integrity of digital audio is essential to property rights, the credibility of publishers, and even national security. Research on perceptual audio hashing has become an active area of multimedia processing and security. Audio refers to sound that is capable of being heard; music and speech are two typical audio signals. Because they differ in signal characteristics, coding schemes, transmission channels, and so on, specific perceptual hashing algorithms should be developed for each. Music is usually termed wideband audio in the signal processing literature. Audio therefore falls into four categories: raw wideband audio, compressed wideband audio, raw speech, and compressed speech.

At present, research on perceptual audio hashing is still at an early stage. Although several algorithms have been proposed, a universal model and performance evaluation methods, which are important for algorithm optimization and testing, are still absent. Moreover, most of the proposed algorithms target raw wideband audio and show drawbacks when applied to compressed wideband audio. Furthermore, they cannot be applied to speech authentication because of the differences between wideband audio and speech.

This dissertation systematically summarizes the research status of perceptual audio hashing and studies the modeling and performance evaluation of perceptual hashing. Specific algorithms are developed for compressed wideband audio, raw speech, and compressed speech. The main contributions of this thesis are as follows:

(1) Based on perception theory, a standard description of perceptual hashing is presented, including its definition, technical framework, and properties in mathematical form. Perceptual hashing is modeled as a Markov information source, and the entropy rate of the Markov source is proposed as a joint quantitative measure for performance evaluation. First, the proposed model and measure are independent of the algorithm and suitable for black-box testing. Second, the entropy rate is an amount of information per unit and is not affected by the hash size, so it can be used for joint evaluation of discrimination power and compactness. Third, the entropy rate has an upper bound and a lower bound. Its value is an absolute indicator of algorithm performance, which clearly shows the distance between the tested algorithm and the optimum.

(2) Compressed-domain audio hashing algorithms are proposed. The perceptual hash is calculated from MDCT coefficients obtained by partially decoding the compressed audio bitstream. The proposed method is highly robust to MDCT-based audio compression and transcoding. Because the algorithm involves no complicated transformation, its computational complexity is low. It is practical in scenarios with strict memory and computational constraints, such as online network audio retrieval.

(3) A novel perceptual hashing method for raw speech, based on the speech production model, is proposed. The perceptual hash is calculated from line spectral frequencies (LSFs), which model the vocal tract. The hash function is key-dependent and collision resistant. It is highly robust to content-preserving operations and localizes tampering with high accuracy. Moreover, the proposed method is not tied to any particular speech coder and is practicable for all types of speech communication systems.

(4) Speech coded at a very low bit rate requires a hash algorithm with high compactness and robustness. G.729 and MELP are two typical low-bit-rate speech coding standards, and perceptual hashing algorithms integrated with them are proposed. LSFs model the changing shape of the speaker's vocal tract and are available as an intermediate result of partial decoding; they are used to generate the hash value. The proposed methods satisfy the robustness and discrimination requirements of perceptual hashing at a very low hash bit rate. They are also computationally efficient and can be applied to scenarios with power restrictions or real-time communication requirements.
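
As a rough illustration of the entropy-rate measure in contribution (1), the sketch below fits a first-order Markov chain to a binary hash sequence and reports the empirical conditional entropy in bits per hash bit. It is a minimal sketch under stated assumptions, not the dissertation's procedure: the binary alphabet, the first-order model, and the function name entropy_rate are all assumptions made here for illustration.

```python
import math
from collections import Counter


def entropy_rate(bits):
    """Empirical entropy rate (bits per symbol) of a first-order
    binary Markov chain fitted to `bits` (hypothetical helper).

    Computes H(X_t | X_{t-1}) from observed transition counts, using
    the empirical frequency of each previous state as its weight.
    """
    # Count every (previous bit, next bit) transition in the sequence.
    counts = Counter(zip(bits[:-1], bits[1:]))
    total = len(bits) - 1
    rate = 0.0
    for prev in (0, 1):
        row = counts[(prev, 0)] + counts[(prev, 1)]
        if row == 0:
            continue  # state never observed as a predecessor
        for nxt in (0, 1):
            c = counts[(prev, nxt)]
            if c:
                # (c / total) is the joint frequency of the transition,
                # (c / row) is the estimated transition probability.
                rate -= (c / total) * math.log2(c / row)
    return rate


if __name__ == "__main__":
    predictable = [0, 1] * 50        # next bit fully determined by the last
    balanced = [0, 0, 1, 1] * 25     # every transition roughly equally likely
    print(entropy_rate(predictable))
    print(entropy_rate(balanced))
```

Under these assumptions the value lies between 0 and 1 bit per hash bit: a fully predictable hash sequence scores 0, and a sequence whose transitions are all equally likely approaches 1.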

Keywords/Search Tags:

Perceptual Hashing, audio, speech, multimedia content authentication, performance evaluation, compressed domain algorithm

Related items

1	Research On Speech Perceptual Hashing Authentication Algorithm And Security Analysis Based On Compressed Domain
2	Research On Perceptual Hashing Authentication Algorithm For Multi-format Audio
3	Research On Information Hiding In Speech Perception Authentication System
4	Research On Speech Perceptual Hashing Authentication Method And Its Application In Mobile Terminal
5	Research On Long Sequence Speech Perceptual Hash Authentication Algorithm Based On Multi-feature Fusion
6	Study On Identity Verification And Content Authentication Of Speech Based On Perceptual Hashing
7	Research On High-efficiency Speech Perceptual Hashing Authentication Algorithm Based On Instant Speech Communication
8	Research On Encrypted Speech Authentication And Recovery Algorithm Of Resisting Tampering Attacks
9	Research On Security Analysis Method Of Speech Perceptual Hashing Authentication Algorithm
10	Research On Encrypted Speech Content Authentication And Tampering Recovery Method Based On Perceptual Hashing In Cloud Environment