A Study On Chinese Personal Name Recognition Based On Conditional Random Fields

Posted on: 2011-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: D L Wang
GTID: 2178330332961266
Subject: Computer software and theory
Abstract/Summary:
Chinese Personal Name Recognition (CPNR) plays an important role in the Named Entity Recognition (NER) task and is widely used in information retrieval, information extraction, and machine translation. Chinese personal names (CPNs) account for a large proportion of named entities, and recognizing them has long been a difficult problem in Chinese Natural Language Processing (CNLP) because of their complex construction and diverse forms.

Building on previous work, this thesis performs CPNR with a Conditional Random Fields (CRFs) model. To improve the performance of the system, it introduces a discourse-based propagation step. The main contributions are as follows:

(1) A detailed description of the CRFs model and a comparison with other machine-learning models. CRFs are a strong conditional probability model: they overcome the independence assumption of generative models and also resolve the label-bias problem of directed graphical models, while inheriting the advantages of both types of model.

(2) A CPN may appear many times in the same corpus, each time with different context information. Occurrences with strong context information are easier to recall than the others. Based on discourse, this thesis constructs a dictionary of personal names extracted from the output of the CRFs model and then performs a second recognition pass over the text to improve performance.

The approach can also be applied to recognizing Chinese location names and organization names. Experimental results show that the method is effective.
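The two-pass idea in contribution (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not say how the dictionary is applied, so exact string matching is assumed here, and the function names `build_name_dictionary` and `second_pass` are hypothetical. The first-pass spans stand in for the output of a CRFs tagger, which is not shown.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def build_name_dictionary(text: str, first_pass: List[Span]) -> Set[str]:
    """Collect the name strings found by the first (CRFs) pass.

    Assumption: names shorter than two characters are dropped to
    limit spurious matches in the second pass.
    """
    return {text[s:e] for s, e in first_pass if e - s >= 2}


def second_pass(text: str, first_pass: List[Span], names: Set[str]) -> List[Span]:
    """Add occurrences of dictionary names that the first pass missed.

    Assumption: plain exact matching, longest names first, skipping any
    candidate that overlaps a span already accepted.
    """
    spans = set(first_pass)
    for name in sorted(names, key=len, reverse=True):
        start = text.find(name)
        while start != -1:
            end = start + len(name)
            overlaps = any(s < end and start < e for s, e in spans)
            if not overlaps:
                spans.add((start, end))
            start = text.find(name, start + 1)
    return sorted(spans)


# Toy document: the first pass found only the first mention of the name.
text = "王小明说他很忙。后来王小明走了。"
first_pass_spans = [(0, 3)]
name_dict = build_name_dictionary(text, first_pass_spans)
all_spans = second_pass(text, first_pass_spans, name_dict)
```

In this toy document the second mention, which the first pass is assumed to have missed, is recovered because the same string was recognized elsewhere in the same discourse.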
Keywords/Search Tags: Conditional Random Fields, Named Entity, Discourse