Research On Storage And Search For Massive Data Of Audio Fingerprinting

Posted on:2015-04-11

Degree:Master

Type:Thesis

Country:China

Candidate:R T Wang

Full Text:PDF

GTID:2298330452959581

Subject:Computer Science and Technology

Abstract/Summary:

With the arrival of the big data era, the world is producing data at an exponentially increasing rate, especially multimedia data such as images, audio and video. How to manage these data effectively and use them to provide greater convenience is one of the fundamental problems of the information age. With the development of pattern recognition, machine learning and cloud computing, content-based multimedia search has emerged. Compared with traditional keyword-based search, content-based search is independent of tags and keywords, and offers more accurate results and more convenient search methods.

As an important component of multimedia data, audio data is also growing fast. The key problem people face is no longer a lack of data, but how to find the data they want within massive collections. Retrieving audio from large-scale databases effectively and efficiently is a big challenge for both academia and industry.

Audio fingerprinting is one content-based audio search method. Digital features called audio fingerprints are extracted from an unknown audio segment and then looked up in a prepared audio fingerprint database, where similarities are calculated, to obtain detailed information about that audio. This method avoids the missing or wrong tags that affect traditional keyword-based search. At the same time, it can help users find what they want even when they don't know the keywords.

Algorithms for extracting and matching audio fingerprints have achieved significant results in some laboratories, and have been applied in some products with relatively small datasets. However, large-scale datasets introduce performance bottlenecks as well as concurrency and extensibility problems.

Building on in-depth research into audio fingerprint extraction and matching algorithms, this paper designs, implements and optimizes the storage and retrieval of massive audio fingerprints. It first introduces a hash-based structure for audio fingerprints and two distributed hash strategies, and demonstrates the effectiveness of these methods through experiments. On that basis, a distributed serialization solution is introduced and shown to be effective. The storage structure and distributed solutions offer multilevel concurrency, high performance, fault tolerance and easy extensibility. These achievements have practical value for constructing large-scale audio fingerprinting retrieval systems and are significant for the application of audio fingerprinting technologies in modern society.
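The abstract does not describe the hash-based structure or the two distributed hash strategies in detail, so the following is only a generic sketch of the idea, not the thesis's actual design. It assumes that each fingerprint is an integer hash paired with a time offset, that hashes are placed on shards by simple modulo, and that a query is matched by voting on consistent offset differences. The class `ShardedFingerprintIndex` and all of its methods are hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict


class ShardedFingerprintIndex:
    """Toy in-memory stand-in for a distributed fingerprint store (hypothetical)."""

    def __init__(self, num_shards=4):
        # Each shard maps a fingerprint hash to a list of (track_id, offset) postings.
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def _shard_for(self, fp_hash):
        # Assumption: modulo placement, one simple way to spread hashes over nodes.
        return self.shards[fp_hash % len(self.shards)]

    def add_track(self, track_id, fingerprints):
        """Store an iterable of (fp_hash, offset) pairs for one track."""
        for fp_hash, offset in fingerprints:
            self._shard_for(fp_hash)[fp_hash].append((track_id, offset))

    def query(self, fingerprints):
        """Return (track_id, votes) for the best match, or None if nothing matches."""
        votes = Counter()
        for fp_hash, query_offset in fingerprints:
            # .get() avoids creating empty entries for unknown hashes.
            for track_id, offset in self._shard_for(fp_hash).get(fp_hash, []):
                # Matching hashes from the same recording share one offset difference.
                votes[(track_id, offset - query_offset)] += 1
        if not votes:
            return None
        (track_id, _delta), count = votes.most_common(1)[0]
        return track_id, count


# Usage: the query is a three-fingerprint excerpt of "song_a", starting at offset 1.
index = ShardedFingerprintIndex(num_shards=2)
index.add_track("song_a", [(101, 0), (202, 1), (303, 2), (404, 3)])
index.add_track("song_b", [(101, 0), (505, 1), (606, 2)])
result = index.query([(202, 0), (303, 1), (404, 2)])
print(result)
```

In a real deployment each shard would presumably live on a separate node, and the placement rule would determine how easily nodes can be added; the sketch keeps everything in one process to show only the lookup and voting logic.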

Keywords/Search Tags:

audio fingerprinting, big data, storage and retrieval, distributed storage

Related items

1	Research On The Content-Based Audio Retrival
2	Study On Content Based Audio Information Retrieval
3	Research On The Key Technology Of Big Audio Retrieval
4	Design And Implementation On The Audio Fingerprint Retrieval System Based On Big Data Platform
5	Audio Fingerprinting Retrieval Systems Based On Compressed Suffix Array
6	Study On Sample Based Music Information Retrieval
7	Study On Content-based Audio Retrieval Technology
8	Audio Fingerprinting Technology And Its Application In The Copyright Of Broadcast's Music
9	Research On Acoustic Feature Analysis In Audio Retrieval
10	Audio Identification And Authentication Based On Digital Fingerprinting