
Digital Fingerprint-based Audio Retrieval System Design And Implementation

Posted on: 2015-10-04  Degree: Master  Type: Thesis
Country: China  Candidate: X S Gao  Full Text: PDF
GTID: 2308330473951940  Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, with the spread of multimedia technology, the volume of audio data on the network has grown explosively, which has drawn attention to efficient methods for retrieving and classifying audio data. A content-based audio retrieval system compares acoustic features extracted from a signal with those stored in a database in order to retrieve the metadata of the audio signal (author, album, genre, etc.). Potential applications include automatic audio recognition, audio tracking, copyright protection, TV program search, and detection of background music in advertising.

This thesis implements content-based audio retrieval that uses digital audio fingerprints to identify audio files. A digital audio fingerprint is a compact digital signature, extracted from the audio content, that represents its important acoustic features. Fingerprints are stored in the database as an index, together with the corresponding metadata. An unknown audio file is identified by comparing the fingerprint extracted from it with those stored in the database.

The thesis focuses on the steps that most influence system robustness: feature extraction, fingerprint modeling, and matching.

First, this study compares several spectral features: Mel-Frequency Cepstral Coefficients (MFCCs), the chroma spectrum, the constant-Q spectrum, and the product spectrum. The first three are derived from the amplitude spectrum alone, which has been widely used in audio signal processing and key-point detection. The product spectrum is the product of the amplitude spectrum and the group delay, and it is very effective in robust speech recognition.
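The abstract does not give a formula for the product spectrum, so the following is only a minimal sketch under an assumed definition: the power spectrum |X|² multiplied by the group delay, which simplifies to X_R·Y_R + X_I·Y_I, where X is the DFT of the frame x(n) and Y is the DFT of n·x(n). The function names `dft` and `product_spectrum` are hypothetical, and the naive DFT is used only to keep the example free of dependencies.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform (O(n^2)), enough for a short frame."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def product_spectrum(frame):
    """Assumed definition: power spectrum times group delay.

    With X = DFT{x(n)} and Y = DFT{n * x(n)}, the group delay is
    (X_R*Y_R + X_I*Y_I) / |X|^2, so the product with |X|^2 reduces to
    X_R*Y_R + X_I*Y_I and needs no division.
    """
    X = dft(frame)
    Y = dft([t * s for t, s in enumerate(frame)])
    return [a.real * b.real + a.imag * b.imag for a, b in zip(X, Y)]


if __name__ == "__main__":
    # A unit impulse delayed by two samples.
    print(product_spectrum([0.0, 0.0, 1.0, 0.0]))
```

For the delayed impulse in the demo, the power is 1 and the group delay is 2 samples at every frequency, so each bin should come out close to 2. A real system would apply this per windowed frame and use an FFT instead of the naive transform.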
Experiments show that, in the audio retrieval system, feature extraction based on the product spectrum is more accurate than the other three feature extraction methods.

Second, this thesis presents a cumulative similarity model to better capture the similarity among audio data. Experiments show that the cumulative similarity model is more efficient and more accurate than the Euclidean distance model.

Third, a Gaussian mixture model (GMM) is used to improve the robustness of the audio retrieval system. The GMM is trained on the audio database with the expectation-maximization (EM) algorithm and gives a better description of the acoustic features. With the trained GMM, the feature vectors of both the database audio and the query clips are converted into symbolic labels and then retrieved in the database. Experimental results show the advantage of the GMM, which remains accurate even under severe noise distortion.

Finally, the proposed method is compared with an existing general-purpose audio retrieval method, AudioDNA. The main differences between the two are the acoustic feature extraction method and the similarity metric. Experimental results show that the proposed method is more resistant to distortion caused by noise attacks.
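The conversion of feature vectors into symbols with a trained Gaussian mixture model can be illustrated as follows. The abstract does not say how a symbol is chosen, so this sketch assumes each frame is labeled with the index of its most likely mixture component, and that covariances are diagonal. EM training is omitted; the weights, means, and variances below are made-up values standing in for a trained model, and all function names are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def to_symbol(x, weights, means, variances):
    """Index of the component with the highest weighted log-likelihood."""
    scores = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def symbolize(frames, weights, means, variances):
    """Map a sequence of feature vectors to a sequence of symbols."""
    return [to_symbol(f, weights, means, variances) for f in frames]


# Hypothetical two-component model (assumed, not taken from the thesis).
WEIGHTS = [0.5, 0.5]
MEANS = [[0.0, 0.0], [5.0, 5.0]]
VARIANCES = [[1.0, 1.0], [1.0, 1.0]]

if __name__ == "__main__":
    frames = [[0.1, -0.2], [4.8, 5.3], [0.4, 0.3]]
    print(symbolize(frames, WEIGHTS, MEANS, VARIANCES))  # prints [0, 1, 0]
```

Under this assumption, both the database audio and a query clip become short integer sequences, so a small perturbation of a feature vector caused by noise often leaves its symbol unchanged.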
Keywords/Search Tags: Digital audio fingerprint, Feature extraction, Gaussian mixture model, Pattern accumulative similarity, Spectrum feature