
Research On Chinese Named Entity Recognition Based On Rules And Conditional Random Fields

Posted on: 2016-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Cheng
GTID: 2308330464972621
Subject: Computer application technology
Abstract/Summary:
The task of named entity recognition (NER) is to automatically recognize specified entities in a document. It is a fundamental task in natural language processing and is widely used in information extraction, information retrieval, machine translation and automatic question answering. Research on named entity recognition therefore has both theoretical significance and practical value. This paper recognizes Chinese named entities using a combination of rules and the conditional random fields (CRF) model.

This paper first surveys the current state of research on named entity recognition in China and abroad. After analyzing rule-based methods, the hidden Markov model, the maximum entropy model and conditional random fields, it proposes a research scheme that combines rules with the conditional random fields model. Numerals and dates are identified with rules because they follow regular patterns, whereas organization names, person names and place names are identified with conditional random fields because of their irregularity. The text is segmented into single characters so that contextual features and more textual information are available during recognition; several previous studies show that this gives better performance. Several different feature templates are tried during training, and the best-performing template is selected.

The main work of this paper is summarized as follows:
1. Dates and numerals in the People's Daily news corpus are identified automatically with rules, using GATE, the open-source framework from the University of Sheffield.
2. Different feature templates are compared to formulate a suitable template, and organization names, person names and place names are identified with conditional random fields. The results are compared with the manual annotations to evaluate precision, recall and F-value.
3. A system is designed, and its results show that combining rules and CRF achieves effective named entity recognition.
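The evaluation step described above (comparing system output with manual annotations to obtain precision, recall and F-value) can be illustrated with a minimal Python sketch. This is not the thesis's own code. It assumes one BIO tag per character (B-PER, I-LOC, O and so on), exact-match scoring of whole entities, and the balanced F-value (F1); the tag names, the function names and the example sentence are hypothetical.

```python
from typing import List, Set, Tuple

# (start, end_exclusive, entity_type); an assumed representation of one entity
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from per-character BIO tags such as B-PER, I-PER, O.

    An I- tag that does not continue an open entity of the same type is ignored.
    """
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((start, i, etype))  # close the entity that was open
        if tag.startswith("B-"):
            start, etype = i, tag[2:]     # open a new entity
        else:
            start, etype = None, None
    return spans


def precision_recall_f(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 under exact span-and-type match."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value


# Hypothetical example: the 7 characters of "张三在北京工作"
gold_tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]
pred_tags = ["B-PER", "I-PER", "O", "B-LOC", "O", "O", "O"]
print(precision_recall_f(gold_tags, pred_tags))
```

In this example the person name is matched exactly, but the predicted place name covers only one of its two characters, so it counts as an error under exact matching. Scoring whole entities rather than individual tags is one common convention; the abstract does not state which convention the thesis used.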
Keywords/Search Tags:rules, conditional random fields, named entity recognition, feature template