Distributed Implementation Of The Massive Audio Retrieval Algorithm

Posted on:2019-04-27

Degree:Master

Type:Thesis

Country:China

Candidate:Y Xin

Full Text:PDF

GTID:2348330569479989

Subject:Software engineering

Abstract/Summary:

Content-based audio retrieval is widely used in audio identification, detection of tampering in digital audio content, query by humming, broadcast monitoring, and music recommendation. The robustness of the audio fingerprint and the efficiency of the retrieval algorithm directly affect the user experience and are key factors in any audio retrieval system. Existing audio fingerprint extraction and retrieval algorithms already achieve good robustness and accuracy. When these algorithms are applied to massive audio data sets, however, a standalone server cannot provide the required storage capacity, retrieval speed declines, and expansion is constrained.

To address these problems, a distributed implementation of the sampling-counting (SC) audio retrieval algorithm is proposed. SC is a highly efficient audio retrieval algorithm for a standalone server. A serialized Fibonacci hash table and a grouped (segmented) distributed index are employed to solve the two key issues of a distributed audio retrieval system: the choice of structure for the fingerprint index and the way that index is distributed.

The serialized Fibonacci hash table saves storage space without slowing the search. The Fibonacci hash function reduces the number of hash buckets, and serializing the table reduces the memory used by each hash bucket, which improves memory utilization.

The grouped index divides N = S × M data nodes into S groups, each containing M data nodes. Between groups, a local index structure is used, which reduces the volume of data indexed by each group. Within a group, a global index structure is used: the group's hash table is divided into M equal shares and distributed to the M data nodes in the group. This global distribution of the hash table reduces the retrieval tasks handled by the data nodes in the group. When the index search is completed, only the search results of the data nodes in the group corresponding to the audio data set need to be aggregated, which reduces the communication cost of the cluster.

The experimental results show that applying the serialized Fibonacci hash table and the grouped index partitioning method to the sampling-counting audio retrieval algorithm effectively shortens retrieval time, reduces cluster communication, and improves memory utilization while maintaining accuracy and recall.
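The index layout described in the abstract can be illustrated with a short Python sketch. It is not the thesis's implementation. The 32-bit fingerprint width, the multiplier constant, the assignment of audio items to groups by `audio_id % s_groups`, the contiguous bucket ranges per node, and the function names `fib_hash`, `locate`, and `nodes_to_query` are all assumptions made for illustration.

```python
# Illustrative sketch only: Fibonacci hashing plus an S x M grouped index layout.
# All names and the routing rules below are assumptions, not taken from the thesis.

# floor(2**32 / golden ratio): the usual multiplier for 32-bit Fibonacci hashing.
FIB_MULT_32 = 2654435769


def fib_hash(fingerprint: int, bits: int) -> int:
    """Map a 32-bit fingerprint to one of 2**bits buckets (multiplicative hashing)."""
    product = (fingerprint * FIB_MULT_32) & 0xFFFFFFFF  # keep the low 32 bits
    return product >> (32 - bits)                       # use the top `bits` bits


def locate(audio_id: int, fingerprint: int, s_groups: int, m_nodes: int, bits: int):
    """Return (group, node, bucket) where one fingerprint would be stored.

    Assumed rules: an audio item belongs to exactly one group (local index
    between groups); inside a group the hash table is cut into M contiguous,
    equal shares (global index within the group).
    """
    group = audio_id % s_groups
    bucket = fib_hash(fingerprint, bits)
    buckets_per_node = (1 << bits) // m_nodes
    node = min(bucket // buckets_per_node, m_nodes - 1)  # last node absorbs any remainder
    return group, node, bucket


def nodes_to_query(fingerprint: int, s_groups: int, m_nodes: int, bits: int):
    """List the (group, node) pairs a lookup of one fingerprint would touch.

    Under the assumed layout, a lookup reaches one node per group (S nodes)
    rather than all N = S * M nodes.
    """
    bucket = fib_hash(fingerprint, bits)
    buckets_per_node = (1 << bits) // m_nodes
    node = min(bucket // buckets_per_node, m_nodes - 1)
    return [(group, node) for group in range(s_groups)]


if __name__ == "__main__":
    print(locate(audio_id=7, fingerprint=1, s_groups=3, m_nodes=4, bits=8))
    print(nodes_to_query(fingerprint=1, s_groups=3, m_nodes=4, bits=8))
```

With S = 3 groups and M = 4 nodes per group, a lookup in this sketch contacts 3 of the 12 nodes, which is one plausible reading of how the grouped structure reduces the retrieval tasks per node and the communication cost of the cluster.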

Keywords/Search Tags:

Philips audio fingerprint, distributed audio retrieval, sampling-counting retrieval algorithm, fingerprint index structure, distributed structure

Related items

1	The Research Of Segmented Audio Retrieval Algorithm Based On Audio Fingerprint
2	Efficient Retrieval For Massive Audio Based On Content
3	Index Structure And Retrieval Approach Of Very Large Scale Fingerprint Database
4	Key Technology And Implementation Of Audio-Based Retrieval System Based On Content
5	The Music Retrieval Technology Based On Audio Fingerprint And Version Identification
6	Audio Fingerprint Oriented Quantum Hash Technology Research
7	Digital Fingerprint-based Audio Retrieval System Design And Implementation
8	Design And Implementation On The Audio Fingerprint Retrieval System Based On Big Data Platform
9	Research On ADs Monitoring Technology Based On Audio Match
10	Audio Retrieval Resisting To Speed-Change