
Music Query by Humming Combined with Speech Recognition Technology

Posted on: 2009-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2208360242988528
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of computer technology, users are no longer satisfied with a single mode of retrieval, and many studies focus on how to retrieve the multimedia information users want. Query by humming (QBH), a new music retrieval method, lets a user locate a wanted piece in a large music repository simply by singing a few notes. The musical score is a high-level feature: if it can be identified accurately, the detection rate improves greatly. Identifying a note sequence quickly and accurately from a singing signal is, however, a challenging task. This thesis applies speech recognition technology to music retrieval. The work draws on knowledge from several fields and provides a key practical technology for large-scale QBH systems. To address the problem of sung-melody identification, we present a novel algorithm that recognizes the melody from humming or singing signals. The research is carried out as follows:
(1) The theory of speech recognition systems is presented. After discussing the feasibility of applying speech recognition technology and the difficulties of a query-by-humming system, a solution and a processing framework are derived.
(2) The theory of the Hidden Markov Model (HMM), which is frequently used in continuous speech recognition (CSR), is briefly introduced. To use HMMs in practical speech recognition applications, three important problems have to be solved. The differences between HMM and dynamic time warping (DTW) are given, and the advantages of the HMM approach are explained.
(3) A novel algorithm is presented for recognizing the melody from humming or singing signals. Based on the HMM, the training data and the training process of the acoustic model and the language model are described.
(4) In training the acoustic model, even when pitch estimation is used, any hard decision in voiced/unvoiced segmentation and pitch estimation unavoidably causes some degradation. To solve this problem, several feature extraction methods are studied. The acoustic model is trained on high-order cepstrum features, which enhances robustness. In addition, a key-independent 4-gram language model is employed to represent musical prior knowledge.
(5) Applying hummed-melody recognition based on statistical methods, we designed and implemented a query-by-humming system. The experimental results are reported both as the musical note recognition error rate and as the end-to-end performance of a query-by-humming baseline system that uses the algorithm as its front end. The results show that the proposed algorithm is the most robust under noisy conditions, remains close to the best performance on clean data, and achieves higher retrieval precision.
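The key-independent 4-gram language model in item (4) can be illustrated with a small sketch. The abstract does not say how key independence is achieved; the example below assumes one common approach, in which the melody is represented by the semitone intervals between successive notes rather than by absolute pitches, so that the same tune sung in any key yields the same n-grams. The function names, the MIDI note encoding, and the sample melody are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def to_intervals(midi_notes):
    """Convert absolute MIDI pitches to successive semitone intervals.

    Assumption: interval coding is used here to make the representation
    independent of the key the user happens to sing in.
    """
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


def ngram_counts(symbols, n=4):
    """Count every contiguous n-gram in a symbol sequence."""
    return Counter(
        tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)
    )


# Hypothetical melody as MIDI note numbers, and the same melody
# sung three semitones higher.
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60]
transposed = [pitch + 3 for pitch in melody]

counts_original = ngram_counts(to_intervals(melody))
counts_transposed = ngram_counts(to_intervals(transposed))

# Transposition changes every absolute pitch but none of the intervals,
# so both versions produce identical 4-gram statistics.
print(counts_original == counts_transposed)
```

In a full language model these counts would be normalized and smoothed into conditional probabilities; the sketch stops at raw counts to keep the key-independence property visible.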
Keywords/Search Tags: speech recognition, query by humming, HMM, musical tune recognition