Research On Chinese Named Entity Recognition

Posted on: 2007-08-27
Degree: Master
Type: Thesis
Country: China
Candidate: X T Liao
GTID: 2178360185985937
Subject: Computer Science and Technology
Abstract/Summary:
Chinese Named Entity (NE) recognition is the task of recognizing specific entities in text. It is a key technique in many Chinese information processing applications, such as information extraction, machine translation, and question answering. Because of the characteristics of the Chinese language itself, however, Chinese Named Entity recognition is very difficult. Since other technologies and applications depend on it, research on Chinese named entity recognition is both significant and necessary.

At present, rule-based and statistics-based methods are the two main techniques in the field of Chinese Named Entity recognition. Statistics-based techniques rely on statistical models, which can be divided into generative models and conditional models. This paper discusses several methods for Chinese Named Entity recognition and analyzes the differences among them. We introduce four kinds of Chinese Named Entity recognition techniques: the rule-based method, the Hidden Markov Model (HMM), Maximum Entropy (ME), and Conditional Random Fields (CRF).

For the rule-based method, we established different rules according to the characteristics of each NE type. These rules describe the internal and external features of an NE well and obtain high precision. However, it is very difficult to build a rule set and to port or adapt it for use in another system, so the method is too expensive.

HMM, ME, and CRF are all statistical models. HMM is a typical generative model, while ME and CRF are both conditional models. By analyzing the results of experiments on the three models, we show that CRF is the best technique for NER. In addition, we have made an intensive study of the ME model. We record how performance changes under different tag sets, different feature templates, and the addition of language features, and finally use a layered ME model to improve the recognition of organization names.

In a word, Chinese Named Entity recognition is one of the most important problems in Natural Language Processing. This paper has made some valuable...
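The abstract mentions tag sets and feature templates without specifying them. As a rough illustration only, the sketch below shows what a character-level BIO tag set and a simple context-window feature template might look like for a statistical tagger such as ME or CRF. The function names `bio_tags` and `window_features`, the entity types `PER` and `LOC`, the window width, and the sample sentence are all assumptions made for this example and are not taken from the thesis.

```python
from typing import Dict, List, Tuple


def bio_tags(chars: str, entities: List[Tuple[int, int, str]]) -> List[str]:
    """Label each character B-TYPE, I-TYPE, or O.

    Each entity is (start, end, type) with an exclusive end index.
    The BIO scheme is an assumed tag set, not necessarily the thesis's.
    """
    tags = ["O"] * len(chars)
    for start, end, ne_type in entities:
        tags[start] = f"B-{ne_type}"
        for i in range(start + 1, end):
            tags[i] = f"I-{ne_type}"
    return tags


def window_features(chars: str, i: int, width: int = 1) -> Dict[str, str]:
    """Hypothetical feature template: the characters in a window around i."""
    feats = {}
    for offset in range(-width, width + 1):
        j = i + offset
        # Positions outside the sentence are filled with a padding symbol.
        feats[f"c[{offset}]"] = chars[j] if 0 <= j < len(chars) else "<PAD>"
    return feats


# Made-up sentence: "Li Ming works in Beijing".
sentence = "李明在北京工作"
tags = bio_tags(sentence, [(0, 2, "PER"), (3, 5, "LOC")])
features = [window_features(sentence, i) for i in range(len(sentence))]
```

Each `(features[i], tags[i])` pair would be one training instance for a conditional model. Changing the tag set (for example, adding end or single-character tags) or widening the window are the kinds of variations whose effect on performance the abstract says were recorded.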
Keywords/Search Tags: Chinese Named Entity, Rule, Maximum Entropy, Conditional Random Fields, Hidden Markov Model