
Study On Chinese Named Entity Recognition Based On Hidden Markov Model

Posted on: 2009-10-16  Degree: Master  Type: Thesis
Country: China  Candidate: L Y Zhao  Full Text: PDF
GTID: 2178360245468623  Subject: Information Science
Abstract/Summary:
With the advent of the information age and the growth of the Internet, natural language is becoming an inevitable medium for human-computer interaction (HCI), and the demands placed on natural language processing are growing in both depth and breadth. Since the task was first proposed at the Sixth Message Understanding Conference (MUC-6) in 1995, named entity recognition has been regarded by more and more researchers as one of the key technologies of natural language processing. This thesis studies methods for named entity recognition and analyzes the advantages and disadvantages of rule-based and statistical approaches. The amount of contextual information used and the degree of data smoothing are two important factors in recognition. To address the shortcoming of limited access to contextual information, we propose a third-order hidden Markov model (HMM) for Chinese named entity recognition that makes restricted use of linguistic knowledge. The method balances accuracy and recall and achieves better recognition results. Automatic word segmentation and part-of-speech (POS) tagging directly affect named entity recognition, so this thesis uses the Massive Intelligent Segmentation system to segment and tag the text. We use an improved K-means method to estimate the model parameters and a margin linear method to smooth the parameter estimates obtained from frequency statistics. This yields the probabilities with which words compose the different types of named entities. For recognition itself, we apply the third-order hidden Markov model with an improved Viterbi algorithm: the initial observation sequence is re-tagged to obtain the best state sequence. This thesis mainly identifies person names, place names, and organization names. At present the experiments are still at an early stage and much work remains. In future work we will further study rule formulation and data smoothing techniques, with a view to further improving the named entity recognition rate.
Keywords/Search Tags:named entity recognition, hidden markov model, viterbi algorithm, data smoothing technique
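
The abstract does not give the details of the third-order model or of the improved Viterbi algorithm, so the following is only a minimal sketch of plain Viterbi decoding for a third-order HMM, not the thesis's method. It assumes that the transition probability of a tag depends on up to three preceding tags and that the emission probability depends on the current tag alone. The function name `viterbi_order3`, the dictionary layout of `trans` and `emit`, and the constant probability `floor` (a crude stand-in for real smoothing, unrelated to the margin linear method) are all hypothetical.

```python
import math

def viterbi_order3(obs, states, trans, emit, floor=1e-6):
    """Return the most probable tag sequence for `obs`.

    trans[(history, s)] : P(s | history), history = tuple of up to 3 previous tags
    emit[(s, o)]        : P(o | s)
    Missing entries fall back to `floor` (assumption: a stand-in for smoothing).
    """
    # best maps a history (last <= 3 tags) to (log-probability, best path so far)
    best = {(): (0.0, [])}
    for o in obs:
        new_best = {}
        for hist, (logp, path) in best.items():
            for s in states:
                score = (logp
                         + math.log(max(trans.get((hist, s), 0.0), floor))
                         + math.log(max(emit.get((s, o), 0.0), floor)))
                new_hist = (hist + (s,))[-3:]  # keep only the last three tags
                # keep the single best path that ends in this history
                if new_hist not in new_best or score > new_best[new_hist][0]:
                    new_best[new_hist] = (score, path + [s])
        best = new_best
    # pick the best path over all final histories
    return max(best.values(), key=lambda v: v[0])[1]

# Toy usage with made-up probabilities (not data from the thesis).
states = ["O", "PER"]
emit = {("PER", "wang"): 0.5, ("PER", "ming"): 0.5, ("O", "said"): 0.9,
        ("O", "wang"): 0.05, ("O", "ming"): 0.05}
tags = viterbi_order3(["wang", "ming", "said"], states, trans={}, emit=emit)
```

Treating the last three tags as one composite state keeps the dynamic program exact, at the cost of up to |states|^3 histories per position. This is one reason higher-order models depend on the smoothing of sparse transition counts.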