
Named Entity Recognition In The Music Domain Combining Rules And Statistics

Posted on: 2011-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Q Zhang
GTID: 2208360308466956
Subject: Computer software and theory
Abstract/Summary:
Music gives expression to human emotion and has been an enduring theme throughout human history. With the rapid development of the Internet, people are exposed to immense amounts of music information, and finding the music information one cares about on the Web has become a pressing problem. There is therefore a growing need for automated, effective music information processing tools to support music retrieval, personalized music recommendation, music trend analysis, and other related studies. Musical Named Entities (Musical Entities for short) include singers, bands, songs, and albums. They are the basic units of music information and the key to understanding it. Recognizing musical entities correctly from large volumes of music information is therefore an important research problem and a basis for other related studies.

Musical Entity Recognition is a domain-specific (vertical) branch of Named Entity Recognition. A great deal of work exists on Named Entity Recognition, especially for person, place, and organization names, but research on Musical Entity Recognition is rare, especially for Chinese song and album names. To recognize musical entities accurately, we therefore adopt well-known Named Entity Recognition techniques and improve them based on the characteristics of the music domain.

This thesis studies Musical Named Entity Recognition techniques for extracting musical entities from different Web pages quickly and correctly. The work consists of two main tasks.

First, we designed a distributed spider framework, proposed a DOM-based method for Web information extraction, and improved the word segmentation module. These were preparations for Musical Entity Recognition.

Second, by analyzing the characteristics of musical entities and their contexts, we presented a hybrid approach, based on rules and statistics, to Chinese Named Entity Recognition in the music domain. Its core idea is as follows. First, before word segmentation, a rule-based method recognizes those musical entities whose context matches explicit rules. Then, after word segmentation, a Hidden Markov Model identifies most of the musical entities. Finally, errors in the recognition results are corrected using a musical entity library and some rules. This approach combines the advantages of the statistical and rule-based methods. In addition, we proposed a novel and convenient method for tagging the training corpus, which makes the Hidden Markov Model practical for Musical Entity Recognition.

Based on the work above, we implemented a musical named entity recognition system. The experimental results showed that the system achieved relatively high precision and recall, which indicates that the hybrid approach presented here has research significance and practical value.
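The three stages of the hybrid approach can be sketched roughly as follows. This is a minimal illustration in Python, not the implementation from the thesis. The rule (text inside the Chinese title marks 《》 is treated as a song or album title), the two-state tag set, the toy probabilities, and the names `rule_stage`, `viterbi`, and `correct` are all assumptions introduced for this example.

```python
import re

# Stage 1 (before word segmentation): a hypothetical context rule.
# Assumption: text enclosed in 《...》 is a song or album title.
TITLE_RULE = re.compile(r"《([^》]+)》")

def rule_stage(text):
    """Return candidate titles found by the explicit-context rule."""
    return [m.group(1) for m in TITLE_RULE.finditer(text)]

# Stage 2 (after word segmentation): Viterbi decoding over a toy HMM.
# All probabilities below are made up for illustration only.
STATES = ["O", "SINGER"]
START_P = {"O": 0.6, "SINGER": 0.4}
TRANS_P = {
    "O": {"O": 0.8, "SINGER": 0.2},
    "SINGER": {"O": 0.9, "SINGER": 0.1},
}
EMIT_P = {
    "O": {"演唱": 0.4, "了": 0.4, "新歌": 0.2},
    "SINGER": {"周杰伦": 0.9},
}

def viterbi(tokens, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable state sequence for the given tokens.

    Unseen (state, token) pairs get the small probability `floor`.
    """
    if not tokens:
        return []
    # best[i][s] = (probability of the best path ending in s at i, previous state)
    best = [{s: (start_p[s] * emit_p[s].get(tokens[0], floor), None)
             for s in states}]
    for tok in tokens[1:]:
        row = {}
        for s in states:
            row[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(tok, floor), p)
                for p in states
            )
        best.append(row)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for row in reversed(best[1:]):
        path.append(row[path[-1]][1])
    return path[::-1]

# Stage 3: correct predicted labels using a trusted entity library.
def correct(entities, library):
    """Replace a predicted label when the library knows the entity."""
    return [(text, library.get(text, label)) for text, label in entities]

tokens = ["周杰伦", "演唱", "了", "新歌"]
print(rule_stage("她翻唱了《青花瓷》和《七里香》"))          # ['青花瓷', '七里香']
print(viterbi(tokens, STATES, START_P, TRANS_P, EMIT_P))  # ['SINGER', 'O', 'O', 'O']
print(correct([("青花瓷", "ALBUM")], {"青花瓷": "SONG"}))   # [('青花瓷', 'SONG')]
```

In a real system the transition and emission probabilities would be estimated from a tagged training corpus rather than written by hand, and the tag set would distinguish singers, bands, songs, and albums.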
Keywords/Search Tags:Named Entity Recognition, Musical Named Entity, Hidden Markov Model, Training Corpus Tagging