A Study Of Lyric Recognition-Assisted Music Information Retrieval

Posted on: 2014-03-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Guo
Full Text: PDF
GTID: 1268330401463071
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of digital technology and the spread of networks, it has become very easy to access large quantities of digital music. At the same time, music information retrieval (MIR), which aims to find music in a large-scale music database, has become an important and challenging research topic. Recently developed content-based MIR systems, which work on content features such as melody and rhythm, give users richer ways to retrieve music and have become a very popular research topic.

However, most MIR systems use only melody to match music. Since most users are non-professional singers, input queries are very likely to contain melody errors, in which case MIR systems based only on melody features may fail to retrieve the target song. Lyrics, which such MIR systems do not take into account, provide complementary information for song identification. This dissertation tries to improve MIR systems by adding lyrics. We focus on two key problems: the first is extracting lyrics from spoken or sung queries, and the second is the search method. The main contributions and innovations are as follows:

1. Word activation force based language models

Data sparsity is a prominent problem when constructing n-gram language models for lyric recognition of spoken queries, and this dissertation addresses it in order to improve lyric recognition accuracy. Class-based language models offer an appealing way to solve the data sparsity problem, but their performance depends on the word classes. The word activation force (WAF) based affinity measure has been shown to be effective at measuring the similarity between two words. We first apply this affinity measure to measure the similarity between words, and then employ normalized spectral clustering to group words into word classes. Based on the word classes, a class-based language model is easily obtained. Finally, we interpolate our WAF-based language model with a classic word-based n-gram model. Experimental results show the effectiveness of the interpolated model.

2. A multilayer filter-based search method for an MIR system using spoken queries

This dissertation proposes a multilayer filter-based method for finding the target lyric in the lyric database quickly and accurately. The proposed method uses multiple hypotheses from the recognition output for matching. For each hypothesis, if it is correctly recognized, the level-1 filter can quickly find the target songs using indexes; if the level-1 filter cannot find any "matched" songs, level-2 filtering is performed to pre-select the probable lyric candidates, and then the acoustic similarity between each lyric candidate and its corresponding hypothesis is calculated by level-3 filtering. Experimental results show the effectiveness of the proposed method.

3. A lyric recognition-assisted Query-by-Singing/Humming (QBSH) method

Adding lyrics to help QBSH systems is intuitive but challenging. Existing methods use a large vocabulary continuous speech recognizer (LVCSR) for lyric recognition of sung queries, but the extracted lyrics are inaccurate. This dissertation proposes a lyric recognition-assisted QBSH method. Before lyric recognition, we first pre-select candidates using melody matching methods; we then build a recognition network from the lyrics of the candidates and use an isolated-word recognition technique for lyric scoring; finally, candidates are ranked according to their melody matching and lyric scoring results. In our experiments, the proposed method achieves a significant improvement.
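
As an illustration of the interpolation step in contribution 1, the following is a minimal sketch of combining a class-based bigram model with a word-based bigram model. It is not the author's implementation. The abstract does not state the form of interpolation, so linear interpolation with a weight `lam` is an assumption, as is the standard class-bigram factorization P(w | v) = P(c(w) | c(v)) * P(w | c(w)). The toy corpus, the `word_class` mapping (a stand-in for the output of WAF-based spectral clustering, which is not reproduced here), and all function names are hypothetical.

```python
from collections import Counter

# Toy lyric corpus (hypothetical data, for illustration only).
corpus = [
    ["i", "love", "you"],
    ["i", "miss", "you"],
    ["we", "love", "music"],
]

# Hypothetical word classes. In the dissertation these come from normalized
# spectral clustering on WAF-based affinities; here they are hand-assigned.
word_class = {
    "i": "PRON", "we": "PRON", "you": "PRON",
    "love": "VERB", "miss": "VERB",
    "music": "NOUN",
}

unigrams = Counter()        # count(w)
history = Counter()         # count of v used as a bigram history
bigrams = Counter()         # count(v, w)
class_unigrams = Counter()  # count(c)
class_history = Counter()   # count of class c used as a bigram history
class_bigrams = Counter()   # count(c(v), c(w))

for sent in corpus:
    for w in sent:
        unigrams[w] += 1
        class_unigrams[word_class[w]] += 1
    for v, w in zip(sent, sent[1:]):
        bigrams[(v, w)] += 1
        history[v] += 1
        class_bigrams[(word_class[v], word_class[w])] += 1
        class_history[word_class[v]] += 1


def p_word(w, v):
    """Unsmoothed maximum-likelihood word bigram P(w | v); 0 if unseen."""
    return bigrams[(v, w)] / history[v] if history[v] else 0.0


def p_class(w, v):
    """Class bigram: P(c(w) | c(v)) * P(w | c(w))."""
    cv, cw = word_class[v], word_class[w]
    if not class_history[cv]:
        return 0.0
    transition = class_bigrams[(cv, cw)] / class_history[cv]
    emission = unigrams[w] / class_unigrams[cw]
    return transition * emission


def p_interp(w, v, lam=0.5):
    """Linear interpolation of the word-based and class-based models."""
    return lam * p_word(w, v) + (1.0 - lam) * p_class(w, v)


if __name__ == "__main__":
    # "we miss" never occurs in the corpus, so the word model gives 0,
    # while the class model still assigns it probability mass.
    print(p_word("miss", "we"), p_interp("miss", "we"))
```

In this toy setting the unseen bigram "we miss" has zero probability under the word model but a nonzero interpolated probability, because the class model shares statistics across all PRON-to-VERB transitions. This is the sense in which class-based models ease data sparsity. A practical system would also smooth the word-based model and tune `lam` on held-out data.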
Keywords/Search Tags: music information retrieval, lyric recognition, query-by-singing/humming, lyric retrieval, language model