Research Into Chinese Names Entity Recognition Based On The Maximum Entropy Model

Posted on: 2006-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Qiao
GTID: 2168360155456979
Subject: Computer application technology
Abstract/Summary:
Named entity recognition has long been a research focus in natural language processing. It plays an important role in information extraction, question answering (QA) systems, and machine translation. Although named entity recognition technology has reached a high level, evaluation results show that Chinese named entity recognition is still a long way from practical use, because technology, resources, and application requirements have not yet been combined well.

Chinese names recognition is a subproblem of Chinese named entity recognition. At present, domestic research on Chinese names recognition is limited either to recognizing Han nationality names alone or to recognizing translated names alone. Research on recognizing Han nationality names and translated names at the same time is scarce and cannot meet application requirements.

This thesis makes an exploratory attempt to construct a Chinese names recognition system that is based on the Maximum Entropy model and can recognize all kinds of names at the same time. The system achieves relatively good results. The thesis mainly focuses on the following problems:

1. Feature extraction and selection. This thesis proposes features suited to Chinese names based on an analysis of a real corpus, and uses experiments to retain the effective features.
2. Candidate names extraction. This thesis proposes a smoothing mechanism built on traditional statistical information, which ensures that the recall rate of candidate names extraction reaches 99%; at the same time, flexible rules for selecting the threshold are established to improve the precision of extraction. The method here can not only recall names...
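For readers unfamiliar with the Maximum Entropy model named in the abstract, the sketch below shows its standard conditional form, p(y | x) = exp(sum_i w_i f_i(x, y)) / Z(x), applied to a single candidate string. It is only an illustration under stated assumptions: the feature names, the surname list, the weights, and the labels `NAME` and `OTHER` are all hypothetical and are not taken from the thesis, whose actual features and training procedure are not given here.

```python
import math

# Hypothetical surname list, for illustration only.
COMMON_SURNAMES = {"王", "李", "张"}


def active_features(candidate):
    """Return the names of the binary features that fire for a candidate string."""
    feats = []
    if candidate and candidate[0] in COMMON_SURNAMES:
        feats.append("starts_with_common_surname")
    if len(candidate) in (2, 3):
        feats.append("length_2_or_3")
    return feats


def maxent_probs(weights, labels, candidate):
    """Compute p(label | candidate) = exp(score(label)) / Z under a MaxEnt model."""
    scores = {
        y: sum(weights.get((f, y), 0.0) for f in active_features(candidate))
        for y in labels
    }
    z = sum(math.exp(s) for s in scores.values())  # normalisation constant Z(x)
    return {y: math.exp(s) / z for y, s in scores.items()}


# Hypothetical weights; in practice these would be estimated from a corpus.
WEIGHTS = {
    ("starts_with_common_surname", "NAME"): 1.5,
    ("length_2_or_3", "NAME"): 0.5,
}
LABELS = ["NAME", "OTHER"]

probs = maxent_probs(WEIGHTS, LABELS, "王小明")
print(probs)
```

A candidate would then be accepted as a name when its `NAME` probability exceeds some threshold; how the thesis actually selects that threshold is not described in the text available here.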
Keywords/Search Tags:Chinese names recognition, Maximum Entropy model, Features, Candidate names