Research On Robust Music Identification In The Compressed Domain

Posted on: 2011-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y D Liu
Full Text: PDF
GTID: 2178360305997310
Subject: Computer application technology
Abstract/Summary:
A digital audio fingerprint is a robust, compact, content-based signature that summarizes an audio recording. Using such fingerprints to efficiently identify a specific music clip among the vast amount of music data on the Internet has become an important and urgent research field. Although MP3 has become the de facto format for music storage and transmission, almost all previously published audio fingerprinting algorithms still focus on raw data formats such as WAV, and very few schemes operate directly in the compressed domain. This thesis proposes two novel compressed-domain audio fingerprinting algorithms.

In the first algorithm, the input MP3 music file is partially decompressed to obtain MDCT coefficients as intermediate results. From these we calculate the MDCT spectral entropy over consecutive long windows and derive the final fingerprint sequence by magnitude relationship modeling. Owing to its statistically stable nature, this fingerprint exhibits strong robustness against various frequency-domain and time-domain audio distortions. Experimental results show that a five-second music clip is sufficient to identify its original recording in real time, with a top-one precision rate above 90% even under various severe audio signal distortions.

In the second algorithm, the compressed songs stored in the music database and the possibly distorted compressed query excerpts are first partially decompressed to obtain MDCT coefficients as intermediate results. Granules are then grouped into longer slots, and the MDCT coefficients are remapped onto 192 redefined frequency lines, so that the originally different frequency distributions of long windows and short windows fit into a unified framework. From these we construct a series of auditory images, calculate the audio MDCT Zernike moments, and derive the final fingerprint sequence by magnitude relationship modeling. Owing to its statistically stable nature, this fingerprint exhibits strong robustness against various audio signal distortions such as recompression, noise contamination, echo addition, equalization, band-pass filtering, pitch shifting, and moderate time-scale modification. Experimental results show that the proposed method is highly robust for fingerprinting.
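
The spectral-entropy idea behind the first algorithm can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the MDCT coefficients have already been extracted by a partial MP3 decoder (not shown), it assumes one simple form of magnitude relationship modeling (one bit per pair of adjacent blocks, set when the entropy rises), and it assumes bit error rate as the matching score. The abstract specifies none of these details, and all function names are hypothetical.

```python
import math


def spectral_entropy(coeffs):
    """Shannon entropy (in bits) of the normalized energy distribution
    of one block of MDCT coefficients."""
    energies = [c * c for c in coeffs]
    total = sum(energies)
    if total == 0.0:
        return 0.0  # silent block: define the entropy as zero
    entropy = 0.0
    for e in energies:
        if e > 0.0:
            p = e / total
            entropy -= p * math.log2(p)
    return entropy


def fingerprint_bits(entropies):
    """Assumed magnitude-relationship rule: emit 1 when the entropy of a
    block exceeds that of the previous block, otherwise 0."""
    return [1 if cur > prev else 0
            for prev, cur in zip(entropies, entropies[1:])]


def bit_error_rate(bits_a, bits_b):
    """Fraction of differing bits between two equal-length fingerprints
    (assumed matching score; lower means more similar)."""
    if len(bits_a) != len(bits_b) or not bits_a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    return sum(a != b for a, b in zip(bits_a, bits_b)) / len(bits_a)


if __name__ == "__main__":
    # Toy blocks standing in for decoded MDCT coefficients.
    blocks = [[1.0, 1.0, 1.0, 1.0], [3.0, 0.0, 0.0, 0.0], [1.0, 2.0, 0.0, 0.0]]
    entropies = [spectral_entropy(b) for b in blocks]
    print(fingerprint_bits(entropies))
```

Because each bit depends only on whether the entropy rises or falls between neighbouring blocks, a distortion that shifts all entropy values in a similar way leaves most bits unchanged, which is one plausible reading of the robustness claimed above.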
Keywords/Search Tags: audio fingerprinting, compressed domain, MDCT spectral entropy, robustness, audio identification