Named Entity Recognition Based On Conditional Random Fields

Posted on: 2014-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: R X Qi
GTID: 2248330398972332
Subject: Electronic and communication engineering
Abstract/Summary:
Named entity recognition (NER) is a foundational task and a key technology in natural language processing. As information on the Internet becomes more diverse and complex, extracting the most important information has become an essential problem. NER is now applied in machine translation, information extraction, information retrieval, and related fields.

This thesis starts from the characteristics of named entities and focuses on recognizing the names of persons, locations, and organizations. We also implement a named entity recognition system based on conditional random fields (CRFs). The thesis is organized as follows.

First, the opening three chapters briefly analyze the classification, characteristics, and technical difficulties of NER, and survey the methods that have been applied to it. Because the Hidden Markov Model and the Maximum Entropy Markov Model (MEMM) both have difficulty with problems of uncertainty, we introduce conditional random fields and explain the key steps of the CRF model in detail.

Second, in the experimental part, we implement an NER system with training and evaluation modules and describe its structure. We propose a new method for choosing the feature template in order to improve NER performance. To demonstrate the advantages of the proposed method, we design several groups of CRF-based experiments that analyze NER results with respect to three factors: the size of the training dataset, the choice of feature template, and the language being recognized. Using the best feature template found in these experiments, a comparison of CRFs and MEMM shows that CRFs perform better on NER.

Finally, the experimental results support the validity of the models and the corresponding approaches described in this thesis. On the training and test data used here, the system performs well in terms of precision, recall, and F-measure.
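The abstract reports precision, recall, and F-measure without saying how they are computed. The following Python sketch shows one common way to compute them at the entity level from BIO-tagged sequences. The BIO tagging scheme, the exact-span matching rule, and the function names `extract_entities` and `precision_recall_f1` are assumptions made for illustration; they are not taken from the thesis.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def extract_entities(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed scheme)."""
    entities: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        # Close the open span when the current tag does not continue it.
        if start is not None and tag != "I-" + etype:
            entities.add((start, i, etype))
            start, etype = None, None
        # A "B-" tag always opens a new span; stray "I-" tags are ignored.
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


def precision_recall_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level scores: a prediction counts only on an exact span and type match."""
    gold_spans = extract_entities(gold)
    pred_spans = extract_entities(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the predicted ORG span is one token too short.
gold = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "I-ORG"]
pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG", "O"]
p, r, f = precision_recall_f1(gold, pred)
```

In this made-up example, two of the three predicted entities match the gold spans exactly, so precision, recall, and F-measure all equal 2/3. Exact-span matching is strict: a boundary error counts as both a missed entity and a wrong prediction.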
Keywords/Search Tags: named entity recognition, conditional random fields, maximum entropy, feature selection