Design and Implementation of an Audio Fingerprint Retrieval System Based on a Big Data Platform

Posted on: 2018-06-30
Degree: Master
Type: Thesis
Country: China
Candidate: X N Kuang
Full Text: PDF
GTID: 2348330518495566
Subject: Computer Science and Technology
Abstract/Summary:
With the development of streaming media and social networks and the arrival of the big data era, more and more audio content appears on the Internet. This brings convenience to people's entertainment but also makes storage and management more difficult. A large amount of redundant data stored in audio databases wastes storage and increases the difficulty of database maintenance, classification, and retrieval. How to identify redundant audio data, maintain audio databases effectively, and retrieve from massive audio data quickly and accurately has become an important topic in information retrieval research in both academia and industry. First, this paper introduces the business prospects and current research status of audio fingerprint retrieval technology and explains its advantages and the necessity of the research. Audio fingerprinting is a content-based audio retrieval technology. Compared with traditional text-based retrieval, its advantage is that retrieval no longer depends on manually assigned labels and keywords but uses properties of the audio itself, such as time, frequency, amplitude, and energy, which reduces manual effort and greatly improves the accuracy and efficiency of search. Next, existing audio fingerprint retrieval technologies, including Echoprint, Chromaprint, and the Philips algorithm, are studied and compared. Building on these existing algorithms, the paper proposes a fingerprint extraction algorithm based on the FFT and hashing, together with a fingerprint retrieval algorithm based on threshold fixed-interval sampling. The proposed algorithms show good robustness and noise resistance, and retrieval accuracy and efficiency are also greatly improved. In addition, the paper optimizes the audio fingerprint data to shorten the fingerprint length and improve retrieval efficiency.
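The abstract does not spell out the extraction algorithm, so the following is only a minimal sketch of the general idea of FFT-plus-hash fingerprinting, not the thesis's actual method. It assumes a Hanning window (mentioned in the keywords), a strongest-bin-per-frame feature, and a hash over pairs of consecutive peaks. The names `frame_peaks` and `fingerprint`, the frame size of 256, and the hop of 128 are all hypothetical choices made for illustration.

```python
import cmath
import hashlib
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def hann(n):
    """Symmetric Hanning window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_peaks(samples, frame_size=256, hop=128):
    """Return the index of the strongest FFT bin of each windowed frame."""
    window = hann(frame_size)
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_size], window)]
        spectrum = fft(frame)
        # Keep positive frequencies only and skip the DC bin (index 0).
        mags = [abs(c) for c in spectrum[1:frame_size // 2]]
        peaks.append(1 + max(range(len(mags)), key=mags.__getitem__))
    return peaks


def fingerprint(samples, frame_size=256, hop=128):
    """Hash each pair of consecutive peak bins into (hash, frame_offset)."""
    peaks = frame_peaks(samples, frame_size, hop)
    prints = []
    for i in range(len(peaks) - 1):
        key = f"{peaks[i]}|{peaks[i + 1]}".encode()
        # Assumption: a 32-bit prefix of SHA-1 stands in for the hash function.
        h = int.from_bytes(hashlib.sha1(key).digest()[:4], "big")
        prints.append((h, i))
    return prints
```

For a pure 500 Hz tone sampled at 8000 Hz, each 256-sample frame has a bin spacing of 31.25 Hz, so the strongest bin is expected at index 16. Real systems typically use several peaks per frame and more robust features than a single bin.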
Then, several storage structures based on hash tables are designed. Weighing storage footprint against retrieval efficiency, a hash table based on dynamic arrays is chosen as the in-memory data structure for storing audio fingerprints. Finally, three big data platforms, Hadoop, Storm, and Spark, are studied, and a serialization-based distributed storage scheme is proposed to improve the concurrency of the algorithm. On this basis, the paper builds a high-concurrency, high-performance distributed audio data storage and retrieval system on the Spark big data platform, which is of significance to the development and application of audio fingerprint retrieval technology.
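As a rough illustration of a hash table backed by dynamic arrays, the sketch below stores each fingerprint hash as a key whose value is a growable array of packed postings. The retrieval step is one possible reading of "threshold fixed-interval sampling": only every `step`-th query fingerprint is looked up, and a match is accepted only if enough sampled lookups agree on the same track and time offset. The class name `FingerprintIndex`, the 32-bit packing, and the `step` and `threshold` defaults are assumptions, not details taken from the thesis.

```python
from array import array
from collections import Counter


class FingerprintIndex:
    """Hash table: fingerprint hash -> dynamic array of packed postings."""

    def __init__(self):
        self.table = {}

    def add(self, track_id, prints):
        """Store (hash, offset) pairs for a track.

        Each posting packs track_id and offset into one unsigned 64-bit
        integer, so both values must fit in 32 bits.
        """
        for h, offset in prints:
            bucket = self.table.setdefault(h, array("Q"))
            bucket.append((track_id << 32) | offset)

    def query(self, prints, step=2, threshold=3):
        """Look up every step-th query fingerprint and vote on alignment.

        Returns (track_id, offset_delta) if the best-aligned candidate
        collects at least `threshold` votes, otherwise None.
        """
        votes = Counter()
        for h, q_offset in prints[::step]:
            for packed in self.table.get(h, ()):
                track_id = packed >> 32
                offset = packed & 0xFFFFFFFF
                # Matching hashes from the same recording share one delta.
                votes[(track_id, offset - q_offset)] += 1
        if not votes:
            return None
        (track_id, delta), count = votes.most_common(1)[0]
        return (track_id, delta) if count >= threshold else None
```

Packing postings into a typed array keeps the per-entry overhead to eight bytes, which reflects the storage-versus-speed trade-off the abstract describes. In a distributed setting the table could be partitioned by hash value, though how the thesis does this on Spark is not stated here.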
Keywords/Search Tags: fingerprinting, distributed solution, Hanning window, hash, FFT