Chinese Named Entity Recognition And Shallow Parsing

Posted on:2013-08-19

Degree:Master

Type:Thesis

Country:China

Candidate:F Zhang

Full Text:PDF

GTID:2248330395452740

Subject:Computer application technology

Abstract/Summary:

Named entity recognition and shallow parsing are two basic tasks in the shallow analysis of Chinese, and they underpin many natural language processing tasks such as syntactic parsing, information extraction, and machine translation. In recent years both problems have attracted wide attention, and many models and methods have been proposed. Most existing solutions, however, are based on sequence labeling. One disadvantage of sequence labeling is that many useful segment-level features cannot be used, which limits the expressive power of the model. Building on existing work, this thesis presents a novel approach that performs segmentation and labeling jointly (hereinafter the "joint learning algorithm") and applies it to named entity recognition and shallow parsing. The work of this thesis can be summarized in two parts:

(1) This thesis proposes a joint learning algorithm for Chinese named entity recognition. We use a perceptron with beam search to build a joint identification and categorization model that performs the two subtasks, boundary identification and entity categorization, simultaneously and together with segmentation. This allows information to be shared between the subtasks and permits segment-level features to be used directly. The joint model can achieve better results for named entity recognition.

(2) We present two approaches to shallow parsing: the joint learning algorithm, and CRF combined with transformation-based error-driven learning. In the joint learning model, we formulate phrase chunking as a joint segmentation and labeling task and implement it with a perceptron and beam search. In the second approach, we first use a CRF model to identify chunks. We then acquire candidate transformation rules by error-driven learning from the CRF chunking results, and an evaluation function is used to filter the candidate rules. Finally, we use the retained transformation rules to revise the CRF chunking results.

The experimental results show that the proposed methods are effective for named entity recognition and shallow parsing, outperforming current methods.
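
The abstract does not give implementation details, so the following is only a minimal sketch of how joint segmentation and labeling with a perceptron and beam search might look. The label set, the maximum segment length, the feature templates, and all function names are assumptions made for illustration and are not taken from the thesis. The point it illustrates is that each hypothesis scores a whole labeled segment at once, so segment-level features (the full segment string, its length) are available directly.

```python
from collections import defaultdict

LABELS = ["PER", "LOC", "O"]  # hypothetical label set
MAX_LEN = 4                   # hypothetical maximum segment length in characters

def segment_features(chars, start, end, label, prev_label):
    """Features of one labeled segment chars[start:end] (illustrative templates)."""
    seg = "".join(chars[start:end])
    return [
        f"seg={seg}|{label}",            # whole-segment string: a segment-level feature
        f"len={end - start}|{label}",    # segment length: also segment-level
        f"first={chars[start]}|{label}",
        f"last={chars[end - 1]}|{label}",
        f"prev={prev_label}|{label}",    # label bigram across adjacent segments
    ]

def all_features(chars, segs):
    """Collect the features of a complete segmentation."""
    feats, prev = [], "<s>"
    for start, end, label in segs:
        feats.extend(segment_features(chars, start, end, label, prev))
        prev = label
    return feats

def decode(chars, weights, beam_size=8):
    """Beam search over (start, end, label) segments that tile the input."""
    n = len(chars)
    # beams[i] holds the best hypotheses (score, segments) covering chars[:i]
    beams = [[] for _ in range(n + 1)]
    beams[0] = [(0.0, [])]
    for end in range(1, n + 1):
        candidates = []
        for start in range(max(0, end - MAX_LEN), end):
            for prev_score, segs in beams[start]:
                prev_label = segs[-1][2] if segs else "<s>"
                for label in LABELS:
                    feats = segment_features(chars, start, end, label, prev_label)
                    gain = sum(weights[f] for f in feats)
                    candidates.append((prev_score + gain, segs + [(start, end, label)]))
        candidates.sort(key=lambda h: h[0], reverse=True)
        beams[end] = candidates[:beam_size]
    return beams[n][0][1]

def train(data, epochs=5, beam_size=8):
    """Structured perceptron: reward gold features, penalize predicted ones."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for chars, gold in data:
            pred = decode(chars, weights, beam_size)
            if pred != gold:
                for f in all_features(chars, gold):
                    weights[f] += 1.0
                for f in all_features(chars, pred):
                    weights[f] -= 1.0
    return weights
```

A call such as `train([(list("张三在北京"), [(0, 2, "PER"), (2, 3, "O"), (3, 5, "LOC")])])` returns a weight table that can then be passed to `decode`. The same decoder could in principle be reused for chunking by swapping the label set for chunk types, which matches the abstract's framing of phrase chunking as joint segmentation and labeling.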

Keywords/Search Tags:

Named entity recognition, Conditional random fields, Perceptron, Beam-search, Shallow parsing, Transformation-based error-driven learning

Related items

1	Named Entity Recognition Based On Conditional Random Fields Chinese Research
2	Recognition Of Named Entity In Electronic Medical Records Based On Cascaded Conditional Random Fields
3	Chinese Named Entity Recognition Based On Conditional Random Fields
4	Named Entity Recognition Based On Conditional Random Fields
5	The Research Of Conditional Random Fields Based Chinese Named Entity Recognition
6	Research On Algorithm And System Implementation On Named Entity Recognition For Chinese Electronic Medical Records
7	A Cambodian-named Entity Recognition Study Based On Constrained Random Fields
8	Chinese Named Entity Recognition Based On Conditional Random Fields
9	Research Of Web Text Named Entity Recognition Based On Conditional Random Fields
10	Research On Chinese Named Entity Recognition Based On Rules And Conditional Random Fields