Research On Method Of Chinese Named Entity Recognition Based On Maximum Entropy Model

Posted on:2009-09-20

Degree:Master

Type:Thesis

Country:China

Candidate:H Yang

GTID:2178360272980485

Subject:Computer application technology

Abstract/Summary:

Named entity recognition is a subtask of information extraction and a basic technology in many natural language processing applications, such as machine translation and question answering. However, because of characteristics of the Chinese language itself, Chinese named entity recognition is very difficult. Research on Chinese named entity recognition is therefore important for advancing other Chinese natural language processing technologies and applications.

In this thesis, a maximum entropy model is used to recognize Chinese named entities. We study the performance of named entity recognition under different feature template sets, and the characteristics of the maximum entropy model in Chinese named entity recognition. A maximum entropy model cannot combine features automatically, and its performance depends significantly on the feature templates. The key to Chinese named entity recognition based on a maximum entropy model is therefore the design of reasonable feature templates.

Chinese contains many implicit semantic features that can help Chinese named entity recognition. Moreover, one important strength of the maximum entropy model is that it can integrate features of different granularities and levels. With that in mind, a number of semantic knowledge bases for Chinese named entities were built in this thesis by extracting information from a corpus. However, because of the limited corpus size and the data sparseness that is common in statistical methods, much significant information cannot be extracted. To address this problem, this thesis applies the idea of semantic expansion to named entity recognition for the first time. The method makes full use of the limited language resources, mines richer knowledge from them without enlarging the corpus, and thus relieves the data sparseness problem to a certain degree.

Experiments show that, relative to using the unexpanded knowledge base, average recall increases by 1.17% and the F value increases by 0.41%. In particular, the precision, recall and F value for the recognition of complex organization names increase by 0.24%, 1.39% and 0.86% respectively.
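The role of feature templates described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the template set, the indicator lexicon `ORG_INDICATORS`, the label names, and the weights are all hypothetical and chosen only for illustration. It shows how templates are instantiated at a character position, how a feature combination has to be written by hand (the model does not form it automatically), and how a maximum entropy model turns weighted features into a conditional distribution over labels.

```python
import math

# Hypothetical indicator lexicon: strings that often end organization names.
ORG_INDICATORS = {"公司", "大学"}

def extract_features(chars, i):
    """Instantiate a few illustrative feature templates at position i."""
    prev_c = chars[i - 1] if i > 0 else "<BOS>"
    next_c = chars[i + 1] if i + 1 < len(chars) else "<EOS>"
    feats = [
        f"C0={chars[i]}",             # current character
        f"C-1={prev_c}",              # previous character
        f"C+1={next_c}",              # next character
        f"C-1C0={prev_c}{chars[i]}",  # combination, written out by hand
    ]
    # Knowledge-base feature: the two characters ending at i match the lexicon.
    if i > 0 and chars[i - 1] + chars[i] in ORG_INDICATORS:
        feats.append("ORG_INDICATOR_END")
    return feats

def maxent_probs(feats, weights, labels):
    """p(y | x) = exp(sum of weights of active (feature, y) pairs) / Z(x)."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

# Usage with made-up weights (a trained model would estimate these).
chars = list("北京大学")
feats = extract_features(chars, 3)
weights = {("ORG_INDICATOR_END", "ORG"): 2.0}
probs = maxent_probs(feats, weights, ["ORG", "O"])
```

Because every feature is just a string paired with a label, character-level, word-level, and knowledge-base features can sit side by side in the same model, which is the sense in which features of different granularities can be integrated.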

Keywords/Search Tags:

Chinese Named Entity, Maximum Entropy Model, Feature, Named Entity Indicator, Semantic Expansion

Related items

1	Based On Maximum Entropy Model Of Chinese Named Entity Recognition
2	The Research On Named Entity Recognition In Chinese Information Processing
3	Chinese Nested Named Entity Recognition Research
4	Research On Product Named Entity Recognition And Normalization
5	Research On Extracting Chinese Entity-relationship Based On Maximum Entropy Model
6	Research On A Two-Stage Method For Chinese Named Entity Recognition
7	Chinese Named Entity Recognition With A Hybrid-Statistical Model
8	Research On Chinese Named Entity Recognition
9	A maximum entropy approach to named entity recognition
10	Research On Named Entity Recognition And Disambiguation Based On Network Semantic Resource