Font Size: a A A

A Study Of Content Based Massive Music Retrieval Technology

Posted on:2014-01-23Degree:DoctorType:Dissertation
Country:ChinaCandidate:Q WangFull Text:PDF
GTID:1228330401963071Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
With the increasing capabilities of data storage and transmission applications, the amount of music data has kept unprecedented growth these years. Unexpectedly, the proliferation of music content has made it more difficult to find musical pieces from such vast music data. Recently the dilemma has motivated a remarkable research focus on how to find the required music accurately and quickly from a large-scale music database. The paper studies content based music information retrieval, expecially query-by-humming/singing (QBSH) and query-by-example (QBE), as well as the key technology of fast search from a massive database. The main contributions and innovations are described as follows:1) Local alignment for QBSHIn the previous work of QBSH, the query has been considered to be a fragment of the music, so the task of QBSH has been to find a sub-fragment, which is most similar to the whole query, from the database. Taking into account humming errors, we assume that only part of the query is a sub-fragment of the music. Based on this assumption, the paper proposes a local alignment framework which searches for the best match local sub-fragment between the query and music in the database. In the proposed framework, QBSH is regarded as the identification of common local sub-fragment, which is robust to humming errors. It can discard serious humming errors, reducing the negative impact of humming errors and improving the retrieval accuracy.2) Note and pitch based locality sensitive hashing filtering for QBSHIn the past, researchers adopted pitch based locality sensitive hashing (LSH) algorithm to improve the retrieval speed in a QBSH system. The paper puts forward the note based LSH algorithm to increase the recall rate of candidate songs, and then employs key transposition recursive alignment (KTRA) algorithm to improve the retrieval accuracy. Taking into account the efficiency, the paper proposes confidence based two layers of filter. If the results of note based LSH filter are unreliable, the query will be put into the pitch based LSH filter to search for the more accurate candidates. The strategy of two layers of filter greatly decreases the average retrieval time.3) Tempo based multi-layer filtering and progressive filtering for QBSHIn a QBSH system, most humming tempos are close to the original music tempo, so the humming tempo is an important factor to measure the matching degree of a song. The paper proposes a multi-layer filtering method based on tempo variation to improve the retrieval speed. We first use the original humming clip to search for candidates and then adjust the tempo to search for more candidates. Since the humming tempo reflects the matching degree of a song, the paper presents a fusion of tempo and KTRA based progressive filtering algorithm. We first adopt fast but inaccurate algorithms to reduce the number of candidate songs, and then make use of slow but accurate algorithms to calculate the similarity of candidates, e.g. KTRA. Finally, we fuse the score of tempo and KTRA, and sort all the candidate songs. The tempo provides extra information for melody match, so the fusion strategy improves the retrieval accuracy.4) Entropy based locality sensitive hashing and boundary expanding locality sensitive hashingOne of the key problems in content based music information retrieval is fast search from a large-scale database. The paper studies one of the most popular and fastest search algorithms, namely Locality Sensitive Hashing (LSH), and proposes two improved algorithms: entropy based LSH and boundary expanding LSH. When choosing hash functions, the original LSH algorithm does not consider the actual data distribution. In fact, the distribution of data is not uniform, so some hash functions map points to concentrated values and others map points to discrete values, leading to diverse collision probability. Taking into account uniform projection, the paper proposes entropy based LSH, making the projection uniform and the number of points in different buckets almost the same. In the LSH algorithm, neighbor points are likely to be mapped to two sides of the boundary of adjacent buckets, so points in adjacent buckets are likely to be neighbor points too. Based on the above description, the paper proposes boundary-expanding LSH. The main idea of this algorithm is that the boundary of each bucket is expanded so that adjacent buckets have common region and each point may be mapped to multiple adjacent buckets, significantly increasing the collision probability of neighbor points.5) Structural fingerprint based two layers of filter for QBEIn the QBE system, the results should be accurate and the retrieval speed should be fast. On the basis of Shazam algorithm, the paper proposes a method to construct the structural fingerprint, which uses a plurality of peaks to construct the music fingerprint, increasing the information and discrimination of fingerprints and improving the retrieval speed. In order to improve the retrieval accuracy, the paper presents two layers of filter to obtain more candidates and makes use of the original peaks to calculate the similarity of candidates. The proposed algorithms can improve the retrieval accuracy and speed simultaneously.
Keywords/Search Tags:music information retrieval, query by humming, localitysensitive hashing, query by example
PDF Full Text Request
Related items