Research On Method Of Chinese Named Entity Recognition Based On Maximum Entropy Model

Posted on: 2009-09-20 | Degree: Master | Type: Thesis
Country: China | Candidate: H Yang | Full Text: PDF
GTID: 2178360272980485 | Subject: Computer application technology
Abstract/Summary:
Named entity recognition is a subtask of information extraction and a basic technology in many natural language processing applications, such as machine translation and question answering. However, owing to properties of the Chinese language itself, Chinese named entity recognition is very difficult. Research on Chinese named entity recognition is therefore important for advancing other Chinese natural language processing technologies and applications.

In this thesis, a maximum entropy model is used to recognize Chinese named entities. We study how recognition performance varies across different feature template sets, and we examine the characteristics of the maximum entropy model when applied to Chinese named entity recognition. A maximum entropy model cannot combine features automatically, so its performance depends heavily on the feature templates. The key to Chinese named entity recognition based on a maximum entropy model is therefore the design of reasonable feature templates.

Chinese contains many implicit semantic features that can help named entity recognition. Moreover, an important strength of the maximum entropy model is that it can integrate features of different granularities and levels. With that in mind, this thesis builds several semantic knowledge bases for Chinese named entities by extracting information from a corpus. However, because the corpus is limited in size and data sparseness is pervasive in statistical methods, much significant information cannot be extracted. To address this problem, this thesis applies the idea of semantic expansion to named entity recognition for the first time. The method makes fuller use of the limited language resources available: it mines more knowledge from them without enlarging the corpus, and thus relieves the data sparseness problem to a certain degree.

Experiments show that, relative to using the unexpanded knowledge bases, average recall increases by 1.17% and the F value increases by 0.41%. In particular, for the recognition of complex organization names, precision, recall and F value increase by 0.24%, 1.39% and 0.86% respectively.
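The point that a maximum entropy model cannot combine features automatically can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not list its feature templates, so the character-window templates, the label set, and the toy weights below are all hypothetical. The sketch shows that a conjunction such as "previous character plus current character" contributes to the model only if a template for it is written out explicitly, and that the conditional probability is a normalized exponential of the summed feature weights.

```python
import math

# Hypothetical feature templates: each tuple lists character offsets
# relative to the current position. (-1, 0) is an explicit conjunction
# template; the model would not build it from (-1,) and (0,) by itself.
TEMPLATES = [(-1,), (0,), (1,), (-1, 0), (0, 1)]


def extract_features(chars, i):
    """Instantiate every template at position i of the character list."""
    feats = []
    for template in TEMPLATES:
        parts = [
            chars[i + off] if 0 <= i + off < len(chars) else "<PAD>"
            for off in template
        ]
        offsets = ",".join(str(off) for off in template)
        feats.append("C[" + offsets + "]=" + "|".join(parts))
    return feats


def maxent_probs(feats, weights, labels):
    """Return p(y | x) = exp(sum of w[(feature, y)]) / Z(x) for each label."""
    scores = {
        y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


# Toy usage with made-up weights (assumed values, not trained parameters).
chars = list("abc")
labels = ["B-ORG", "O"]
weights = {("C[0]=a", "B-ORG"): 2.0}
features = extract_features(chars, 0)
probs = maxent_probs(features, weights, labels)
```

In a trained model the weights would be estimated from an annotated corpus; here they are set by hand only to show how one active feature shifts the distribution toward a label.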
Keywords/Search Tags: Chinese Named Entity, Maximum Entropy Model, Feature, Named Entity Indicator, Semantic Expansion