Research On Stable Humming Feature Extraction

Posted on: 2018-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Bian
Full Text: PDF
GTID: 2348330518493485
Subject: Electronics and Communications Engineering
Abstract/Summary:
Query-by-humming is a content-based multimedia retrieval technology and an active research topic in information retrieval. Feature extraction is one of its key technologies and is the focus of this thesis. The main cause of unstable humming features is the individuality of the human voice, which shows up as differences in humming tone, range, and rhythm. Of these three problems, methods proposed in previous studies, such as the local pitch histogram, already handle differences in tone and range. The variation of humming rhythm, however, has not yet been solved effectively.

This thesis addresses the instability of humming features, especially under rhythm changes, and carries out the following research, with the aim of extracting essential and stable information from the melody to further improve the stability of humming features.

1. Improve the local statistical feature extraction method

The local pitch statistical feature is obtained by projecting the pitch onto the pitch axis, which yields a feature that is relatively stable under changes of tone and range. On this basis, this thesis improves the algorithm by introducing a projection weight assignment method based on interval position. To address the instability caused by rhythm variation, a rhythm statistical feature is proposed as a supplement. First, the concept of the basic rhythm and an algorithm for estimating it are introduced. Then the note lengths in a rhythm segment are normalized by the basic rhythm, and the resulting rhythm sequence is projected and counted to obtain a feature analogous to the local pitch histogram.
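The local pitch and rhythm statistics described above can be illustrated with a small Python sketch. It is not the thesis's algorithm, which this summary does not specify. The function names, the use of the median note duration as the "basic rhythm", the median pitch as the reference for transposition, and the power-of-two rhythm bins are all assumptions made for illustration. The projection weighting based on interval position is not modeled here.

```python
import math
from statistics import median


def estimate_basic_rhythm(durations):
    """Hypothetical estimator: take the median note duration as the basic rhythm."""
    return median(durations)


def rhythm_histogram(durations, min_exp=-2, max_exp=2):
    """Histogram of note lengths measured relative to the basic rhythm.

    Each duration is divided by the basic rhythm, snapped to the nearest
    power of two (1/4, 1/2, 1, 2, 4 by default), and counted.
    The counts are normalized so that they sum to 1.
    """
    base = estimate_basic_rhythm(durations)
    bins = [0.0] * (max_exp - min_exp + 1)
    for d in durations:
        exp = round(math.log2(d / base))
        exp = max(min_exp, min(max_exp, exp))  # clip to the bin range
        bins[exp - min_exp] += 1.0
    total = sum(bins)
    return [b / total for b in bins]


def pitch_histogram(pitches, durations, span=12):
    """Duration-weighted histogram of pitch offsets (in semitones).

    Offsets are taken from the segment's median pitch (an assumed
    reference), so transposing the whole segment leaves the result unchanged.
    """
    ref = median(pitches)
    bins = [0.0] * (2 * span + 1)
    for p, d in zip(pitches, durations):
        offset = max(-span, min(span, round(p - ref)))
        bins[offset + span] += d
    total = sum(bins)
    return [b / total for b in bins]


if __name__ == "__main__":
    # Made-up segment: MIDI note numbers and note lengths in seconds.
    pitches = [60, 62, 64, 62, 60, 59, 62]
    durations = [0.5, 0.5, 1.0, 0.5, 0.25, 0.25, 1.0]
    print(rhythm_histogram(durations))
    print(pitch_histogram(pitches, durations))
```

Because every duration is divided by the estimated basic rhythm, humming the same segment uniformly faster or slower gives the same rhythm histogram in this sketch. It does not handle uneven tempo changes within a segment.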
The experimental results show that combining the pitch feature with the rhythm feature effectively improves feature stability and distinguishability, and is robust to differences in tone, range, and rhythm.

2. Propose the melody extreme feature extraction method

The local statistical features of pitch and rhythm solve the humming feature extraction problem to a certain extent, but changes in humming rhythm affect not only the stability of the feature itself but also the units from which features are extracted. To solve this problem, this thesis proposes a method for selecting feature extraction units based on the extreme points of the melody. Because extracted extreme points may be wrong, the extraction method is further optimized and a robust feature structure is designed. The extreme points of a melody are highly robust to changes in humming rhythm, so defining the basic feature units by them makes the feature extraction units themselves robust to rhythm change. In addition, analyzing the melody structure through its extreme points greatly reduces the number of index entries compared with the previous linear scaling and sliding window approach, which saves computing resources, shortens retrieval time, and improves the real-time performance of the humming retrieval system.

Finally, the effectiveness of the proposed method is verified by experiments. A music library of 5000 MIDI songs was queried with 1153 humming clips. In the retrieval experiment based on locality-sensitive hashing, the algorithm achieves a top-1 accuracy of 88.6% and a top-5 accuracy of 92.8%, with an average retrieval time of 1.92 s (the MRR value is not given). Compared with the exhaustive retrieval system based on linear scaling and a sliding window, it greatly reduces retrieval time while maintaining acceptable retrieval accuracy.
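The idea of using melody extreme points as boundaries of feature extraction units can be sketched as follows. This is a minimal illustration, not the optimized extraction method of the thesis. The function names and the `min_jump` threshold are hypothetical, and the check against neighboring pitches is only a simple stand-in for whatever error handling the thesis uses.

```python
def melody_extremes(pitches, min_jump=1.0):
    """Return (index, kind) pairs for local maxima and minima of a pitch sequence.

    Runs of repeated pitch are collapsed first, so a held or repeated note
    counts as one point regardless of how long it lasts. A point is kept
    only if it differs from both neighbors by at least `min_jump` semitones
    (an assumed, simple filter against spurious extremes).
    """
    if not pitches:
        return []
    # Keep the first index of every run of equal pitch values.
    idx = [0]
    for i in range(1, len(pitches)):
        if pitches[i] != pitches[idx[-1]]:
            idx.append(i)
    extremes = []
    for k in range(1, len(idx) - 1):
        prev_p = pitches[idx[k - 1]]
        cur_p = pitches[idx[k]]
        next_p = pitches[idx[k + 1]]
        rise = cur_p - prev_p  # > 0 if the melody moved up into this point
        fall = cur_p - next_p  # > 0 if the melody moves down after it
        # Same sign means a peak (both positive) or a valley (both negative).
        if rise * fall > 0 and min(abs(rise), abs(fall)) >= min_jump:
            extremes.append((idx[k], "max" if rise > 0 else "min"))
    return extremes


def segment_by_extremes(pitches, min_jump=1.0):
    """Split the sequence into (start, end) index pairs at the extreme points."""
    if len(pitches) < 2:
        return []
    cuts = [i for i, _ in melody_extremes(pitches, min_jump)]
    bounds = [0] + cuts + [len(pitches) - 1]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    # Made-up melody as MIDI note numbers, one value per note.
    notes = [60, 62, 64, 62, 60, 59, 62, 65, 64]
    print(melody_extremes(notes))
    print(segment_by_extremes(notes))
```

In this sketch the extremes depend only on the order of pitch changes, so stretching or shortening individual notes changes where the extremes fall in time but not which pitch values are selected. That is the kind of rhythm robustness the abstract attributes to extreme-point units, and segmenting at these points produces far fewer candidate units than sliding a window over every position.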
Keywords/Search Tags:query-by-humming, feature extraction, local statistics, melody extremum, melody segmentation