Chinese Named Entity Recognition And Shallow Parsing

Posted on: 2013-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhang
Full Text: PDF
GTID: 2248330395452740
Subject: Computer application technology
Abstract/Summary:
Named entity recognition and shallow parsing are two basic tasks in the shallow analysis of Chinese, and they underpin many natural language processing tasks such as syntactic parsing, information extraction, and machine translation. In recent years both problems have attracted wide attention, and many models and methods have been proposed. Most existing solutions, however, are based on sequence labeling. One disadvantage of sequence labeling is that many useful segment-level features cannot be used, which limits the expressive power of the model. Building on existing work, this thesis presents a novel approach that performs segmentation and labeling jointly (hereinafter the "joint learning algorithm") for named entity recognition and shallow parsing. The work of this thesis can be summarized in two parts:

(1) This thesis proposes a joint learning algorithm for Chinese named entity recognition. We use a perceptron with beam search to build a joint identification and categorization model that performs the two subtasks, boundary identification and entity categorization, simultaneously, together with segmentation. This allows information to be shared between the subtasks and lets segment-level features be used directly. The joint model can achieve better results on named entity recognition.

(2) We present two approaches to shallow parsing: the joint learning algorithm, and conditional random fields (CRF) combined with transformation-based error-driven learning. In the joint learning model, we formulate phrase chunking as a joint segmentation and labeling task and implement it with a perceptron and beam search. In addition, this thesis proposes a shallow parsing method based on CRF and transformation-based error-driven learning. First, we use a CRF model to identify chunks. Then we acquire candidate transformation rules by error-driven learning from the CRF chunking results, and an evaluation function is used to filter the candidate rules. Finally, we apply the retained transformation rules to revise the CRF chunking results.

The experimental results show that the proposed methods are effective for named entity recognition and shallow parsing, outperforming current methods.
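
The abstract does not give implementation details, so the following is only a minimal sketch of how joint segmentation and labeling with a perceptron and beam search might look. The label set, the segment length cap, the feature templates, and the function names (`segment_features`, `beam_decode`, `train`) are all assumptions made for illustration and are not taken from the thesis. The sketch also uses a plain full-sequence perceptron update, which may differ from the update strategy the thesis actually uses.

```python
from collections import defaultdict

LABELS = ("PER", "LOC", "ORG", "O")  # hypothetical label set
MAX_LEN = 4                          # hypothetical cap on segment length


def segment_features(chars, start, end, label, prev_label):
    """Hypothetical segment-level features for the span chars[start:end]."""
    text = "".join(chars[start:end])
    return [
        f"seg={text}|{label}",            # the whole segment string
        f"len={end - start}|{label}",     # segment length
        f"first={chars[start]}|{label}",  # first character of the segment
        f"last={chars[end - 1]}|{label}", # last character of the segment
        f"prev={prev_label}|{label}",     # label of the previous segment
    ]


def beam_decode(chars, weights, beam_size=8):
    """Return the best-scoring list of (start, end, label) covering chars."""
    # beams[i] keeps up to beam_size hypotheses (score, segments) for chars[:i].
    beams = [[] for _ in range(len(chars) + 1)]
    beams[0] = [(0.0, [])]
    for end in range(1, len(chars) + 1):
        candidates = []
        for start in range(max(0, end - MAX_LEN), end):
            for prev_score, segs in beams[start]:
                prev_label = segs[-1][2] if segs else "<s>"
                for label in LABELS:
                    feats = segment_features(chars, start, end, label, prev_label)
                    total = prev_score + sum(weights[f] for f in feats)
                    candidates.append((total, segs + [(start, end, label)]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams[end] = candidates[:beam_size]
    return beams[len(chars)][0][1]


def all_features(chars, segs):
    """Collect the features of every segment in a full analysis."""
    feats, prev = [], "<s>"
    for start, end, label in segs:
        feats.extend(segment_features(chars, start, end, label, prev))
        prev = label
    return feats


def train(data, epochs=5, beam_size=8):
    """Structured perceptron: reward gold features, penalize predicted ones."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for chars, gold in data:
            pred = beam_decode(chars, weights, beam_size)
            if pred != gold:
                for f in all_features(chars, gold):
                    weights[f] += 1.0
                for f in all_features(chars, pred):
                    weights[f] -= 1.0
    return weights


# Toy usage: one sentence with a person name and a location (made-up data).
sentence = list("张三在北京")
gold_segments = [(0, 2, "PER"), (2, 3, "O"), (3, 5, "LOC")]
model = train([(sentence, gold_segments)])
print(beam_decode(sentence, model))
```

Because each hypothesis is extended by a whole labeled segment, features such as the full segment string or its length are available when the segment is scored. A character-level sequence labeler cannot use such features directly, which is the limitation the abstract points out.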
Keywords/Search Tags: Named entity recognition, Conditional random fields, Perceptron, Beam-search, Shallow parsing, Transformation-based error-driven learning