Research On Chinese Named Entity Recognition Based On CRF

Posted on: 2012-03-27  Degree: Master  Type: Thesis
Country: China  Candidate: F Wang  Full Text: PDF
GTID: 2178330335478269  Subject: Detection Technology and Automation
Abstract/Summary:
Named Entity Recognition (NER) is the task of recognizing specific entities in text. As the basic information units of a text, named entities are essential to understanding it correctly. NER is a basic task in natural language processing research and is widely used in machine translation, information extraction, automatic summarization, and other applications. How to identify named entities therefore has great theoretical and practical significance.

This thesis first surveys and summarizes the current state of NER research. It then introduces the evaluation strategy for NER and analyzes the current NER methods.

The conditional random field (CRF) model is described in detail. CRF is a statistical machine learning method that performs well at labeling and segmenting sequences. When training the model, we added part of speech as an external feature of the training data. The results show that adding external features to the training corpus can, to a certain extent, compensate for limited training-data size and improve entity recognition.

The SIGHAN 2006 MSRA data served as our corpus. The experiments tested the effect of the feature template and word size on recognition. The model learned patterns from the training data and used them to recognize named entities in the corpus. We compared the recognition results with the manually labeled corpus and used the experimental results to analyze the validity and feasibility of the model.

Finally, the experimental results show that NER using CRF is feasible, and some comments on future work are made.
Keywords/Search Tags: Conditional Random Field, Named Entity Recognition, Characteristic Template, Tag-Set
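
The abstract names three ingredients without showing them: a feature template, part of speech as an external feature, and a comparison against manually labeled data. The Python sketch below illustrates how these could fit together for character-level tagging. It is not the thesis's actual template or evaluation code. The function names, the window-based template, the B-/I-/O tag set with PER and LOC labels, the PKU-style POS tags, and the sample sentence are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def token_features(chars: List[str], pos: List[str], i: int,
                   window: int = 2) -> Dict[str, str]:
    """Expand a hypothetical unigram/bigram template around position i.

    C[k] is the character at offset k, P[k] the part-of-speech tag of the
    word containing that character (the external feature).
    """
    n = len(chars)
    feats = {"bias": "1"}
    for off in range(-window, window + 1):
        j = i + off
        inside = 0 <= j < n
        # Positions outside the sentence get a padding marker.
        feats[f"C[{off}]"] = chars[j] if inside else "<PAD>"
        feats[f"P[{off}]"] = pos[j] if inside else "<PAD>"
    prev_c = chars[i - 1] if i > 0 else "<PAD>"
    next_c = chars[i + 1] if i + 1 < n else "<PAD>"
    feats["C[-1]C[0]"] = prev_c + "/" + chars[i]
    feats["C[0]C[1]"] = chars[i] + "/" + next_c
    return feats


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert B-/I-/O tags to (start, end_exclusive, label) spans.

    An I- tag that does not continue an open span of the same label is
    ignored (one of several possible conventions).
    """
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((start, i, label))
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        else:
            start, label = None, None
    return spans


def precision_recall_f1(gold: List[str],
                        pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores: a predicted span counts only on exact match."""
    gold_spans, pred_spans = set(bio_to_spans(gold)), set(bio_to_spans(pred))
    tp = len(gold_spans & pred_spans)
    p = tp / len(pred_spans) if pred_spans else 0.0
    r = tp / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Illustrative sentence (not from the MSRA corpus): "Wang Fang works in Beijing".
chars = list("王芳在北京工作")
pos = ["nr", "nr", "p", "ns", "ns", "v", "v"]
gold = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]
pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "O", "O"]

features = [token_features(chars, pos, i) for i in range(len(chars))]
scores = precision_recall_f1(gold, pred)
```

Each feature dictionary corresponds to one row that a CRF toolkit would receive for training or decoding. Changing `window` is one simple way to vary the template, and dropping the `P[k]` entries gives a baseline without the external feature.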