Application Research On Chinese Named Entity Recognition Based On Domain Ontology

Posted on: 2012-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: W L Chang
Full Text: PDF
GTID: 2178330335952716
Subject: Computer Science and Technology
Abstract/Summary:
Named Entity Recognition (NER) plays a critical role in many Natural Language Processing (NLP) applications, such as Information Extraction, Information Retrieval, and Machine Translation, yet it remains a challenging task. The earliest NER research abroad focused on English. As NER technologies have developed, however, a growing number of researchers have turned their attention to Chinese NER. The characteristics of Chinese entities make them more difficult to recognize than English ones. This thesis first reviews and analyzes the methods and technologies of NER. Among these methods, the Conditional Random Fields (CRFs) model has achieved better recognition performance than other models in Chinese NER.

To improve entity recognition performance, this thesis combines statistics-based and rule-based methods for Chinese NER, emphasizing the role of domain ontology. First, a Notebook domain ontology is constructed with a seven-step method to which an object-oriented approach is applied. Second, the research focuses on how to choose effective features for the CRFs model in order to improve the performance of domain NER; the solution is to take the ontology as a semantic feature alongside the word and part-of-speech (POS) features. Finally, rules are extracted from the domain ontology to recognize the general named entities that users care about, which serves as an important supplement to the preceding recognition result. The overall performance of NER is thereby enhanced.

To validate the effect of domain ontology on Chinese NER, experiments compare the two kinds of feature templates defined in the thesis. The results show that the precision, recall, and F-measure of the template containing the ontology feature are higher than those of the common template, which indicates that the ontology plays an important role in Chinese NER. Moreover, the method combining rules and statistics performs better than the CRFs-based method alone. In addition, the concrete steps of Chinese NER are demonstrated by means of a simple named entity recognition system.
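The difference between the two feature templates can be illustrated with a minimal sketch. It is not the thesis's actual template: the tiny ontology lexicon, the POS tags, the window of one token on each side, and the feature names (`w[...]`, `pos[...]`, `onto[...]`) are all assumptions made for illustration. The sketch builds per-token feature dictionaries, of the kind a CRF toolkit could consume, with and without the ontology class as a semantic feature.

```python
from typing import Dict, List, Tuple

# Hypothetical ontology lexicon (term -> ontology class); illustrative only,
# not taken from the thesis's Notebook domain ontology.
ONTOLOGY: Dict[str, str] = {
    "联想": "Brand",
    "笔记本": "Product",
    "内存": "Component",
}

PAD = "<PAD>"    # marker for positions outside the sentence
NO_CLASS = "NONE"  # marker for tokens not found in the ontology lexicon


def token_features(
    sent: List[Tuple[str, str]], i: int, use_ontology: bool = True
) -> Dict[str, str]:
    """Build features for token i from a window of -1..+1 tokens.

    sent is a list of (word, pos_tag) pairs produced by an upstream
    word segmenter and POS tagger (assumed to exist).
    """
    feats: Dict[str, str] = {}
    for off in (-1, 0, 1):
        j = i + off
        in_range = 0 <= j < len(sent)
        word, pos = sent[j] if in_range else (PAD, PAD)
        feats[f"w[{off}]"] = word
        feats[f"pos[{off}]"] = pos
        if use_ontology:
            # Semantic feature: the ontology class of the word, if any.
            feats[f"onto[{off}]"] = ONTOLOGY.get(word, NO_CLASS) if in_range else PAD
    return feats


def sentence_features(
    sent: List[Tuple[str, str]], use_ontology: bool = True
) -> List[Dict[str, str]]:
    """Feature dictionaries for every token in the sentence."""
    return [token_features(sent, i, use_ontology) for i in range(len(sent))]


# Example sentence, already segmented and POS-tagged (tags are assumptions).
sentence = [("联想", "nz"), ("笔记本", "n"), ("很", "d"), ("轻", "a")]
baseline = sentence_features(sentence, use_ontology=False)  # word + POS only
with_onto = sentence_features(sentence, use_ontology=True)  # word + POS + ontology
```

Here `baseline` corresponds to a common template and `with_onto` to a template extended with the ontology feature; training the same CRF on each and comparing precision, recall, and F-measure would mirror the comparison described in the abstract.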
Keywords/Search Tags: Word Segmentation, Part-of-speech Tagging, Conditional Random Fields, Chinese Named Entity Recognition, Domain Ontology