The Research Of Applying Conditional Random Fields To Chinese Word Segmentation And Part-Of-Speech Tagging

Posted on:2009-11-27

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Yu

Full Text:PDF

GTID:2178360242974992

Subject:Computer application technology

Abstract/Summary:

With the rapid progress of information technology, people hope to communicate with computers in natural language, as humans do with one another. Natural language understanding is an interesting and challenging task. From the viewpoint of computer science, and of artificial intelligence in particular, its goal is to build a computer model that can understand, analyze, and answer questions as humans do. Chinese natural language processing is the core technology for enabling computers to understand Chinese. Chinese syntactic parsing is an important problem in Chinese information processing, and progress on it can also promote the development of related areas of linguistics.

The main work of this thesis covers three aspects:

(1) The thesis introduces the principle of Maximum Entropy and its significance for natural language understanding research. Furthermore, it discusses the definition of Conditional Random Fields, which is heavily motivated by the principle of maximum entropy. The Conditional Random Fields model relaxes the strong independence assumptions that generative models such as the Hidden Markov Model must make, and it overcomes the label-bias problem exhibited by the Maximum Entropy Markov Model and other non-generative models.

(2) Existing algorithms and models for Chinese word segmentation and part-of-speech tagging are compared and synthesized. Building on existing research, the thesis adopts a Chinese word segmentation method based on Conditional Random Fields, which improves precision compared with several traditional models.

(3) Based on the characteristics of Chinese word segmentation and the features used in Conditional Random Fields, a set of feature templates is determined, and the segmentation statistics for ambiguous words and out-of-vocabulary words in particular are discussed.

We analyzed, designed, and implemented a module for Chinese word segmentation and part-of-speech tagging based on the Conditional Random Fields model.
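
The abstract does not give the tag set or the feature templates used in the thesis. As a rough illustration of how CRF-based segmentation is commonly framed as character tagging, the sketch below assumes a B/M/E/S position tag set and a hypothetical three-character window template; the function names and the `<BOS>`/`<EOS>` padding symbols are illustrative assumptions, not the thesis's design.

```python
from typing import Dict, List


def words_to_tags(words: List[str]) -> List[str]:
    """Map each character of a segmented sentence to a position tag.

    Assumed tag set: B (word begin), M (word middle), E (word end),
    S (single-character word).
    """
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def char_features(chars: List[str], i: int) -> Dict[str, str]:
    """Apply a hypothetical unigram/bigram template around position i."""

    def at(offset: int) -> str:
        j = i + offset
        if j < 0:
            return "<BOS>"  # padding before the sentence start
        if j >= len(chars):
            return "<EOS>"  # padding after the sentence end
        return chars[j]

    return {
        "C-1": at(-1),
        "C0": at(0),
        "C+1": at(1),
        "C-1C0": at(-1) + at(0),
        "C0C+1": at(0) + at(1),
    }


def tags_to_words(chars: List[str], tags: List[str]) -> List[str]:
    """Rebuild words from characters and their predicted position tags."""
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word closes on E or S
            words.append(current)
            current = ""
    if current:  # keep a trailing unfinished word rather than drop it
        words.append(current)
    return words


if __name__ == "__main__":
    segmented = ["我们", "喜欢", "自然语言", "处理"]
    chars = list("".join(segmented))
    tags = words_to_tags(segmented)
    print(tags)
    print(char_features(chars, 0))
    print(tags_to_words(chars, tags))
```

In a setup like this, a linear-chain CRF would be trained on the per-character feature dictionaries and tags, and its predicted tag sequence would be converted back to words with `tags_to_words`.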

Keywords/Search Tags:

Natural language processing, Chinese Word Segmentation, Part-Of-Speech Tagging, Conditional Random Fields

Related items

1	Research On The Learning Of Integrating Chinese Word Segmentation With Part-of-Speech Tagging And Domain Adaption Approach
2	Word Segmentation And Pos Tagging In Chinese
3	Chinese Word Found Its Part Of Speech Tagging
4	Research On Parallel Corpora-based Unsupervised Part-of-speech Tagging For Chinese
5	Study On Disambiguation Algorithm For Chinese Word Segmentation
6	Research Of Chinese Word Segmentation With Conditional Random Fields
7	Research On Chinese Word Segmentation And Part-of-speech Tagging Based On Deep Learning Methods
8	Research On The Methods Of Automatic Correction Of Chinese Word Segmentation And Part-of-Speech Tagging
9	The Effect Of Part Of Speech On Chinese Word Segmentation
10	Research And Application Of Chinese Word Segmentation Based On Conditional Random Fields