
The Research Of Applying Conditional Random Fields To Chinese Lexical Analysis And Chunk Parsing

Posted on: 2007-04-27  Degree: Master  Type: Thesis
Country: China  Candidate: H Luo  Full Text: PDF
GTID: 2178360182998027  Subject: Computer application technology
Abstract/Summary:
This dissertation argues that lexical analysis and syntactic parsing are fundamental to research on natural language understanding. Following the current trend toward methods that integrate statistics-based and rule-based approaches, it introduces the principle of Maximum Entropy and its significance for natural language understanding research. It then discusses the definition and parameter estimation of Conditional Random Fields (CRFs), which are probabilistic models for segmenting and labeling sequence data and are heavily motivated by the principle of maximum entropy. Next, the dissertation presents a unified approach to Chinese lexical analysis using CRFs. Previous applications of CRFs to Chinese word segmentation convert segmentation into character-based Begin/Inside tagging. This dissertation instead uses the word lattice as the underlying sequence to be tagged, so that the lexicon can be used efficiently and linguistic knowledge can be integrated easily through feature template selection. Finally, the dissertation discusses applying CRFs to Chinese chunk parsing and outlines future work.
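The character-based Begin/Inside tagging used by earlier CRF segmenters can be illustrated with a minimal Python sketch. It shows only the conversion between a segmented sentence and a tagged character sequence, not a CRF model or the word-lattice approach of the dissertation. The function names and the example sentence are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def words_to_bi_tags(words: List[str]) -> List[Tuple[str, str]]:
    """Label each character B (begins a word) or I (inside a word)."""
    tagged = []
    for word in words:
        for i, ch in enumerate(word):
            tagged.append((ch, "B" if i == 0 else "I"))
    return tagged


def bi_tags_to_words(tagged: List[Tuple[str, str]]) -> List[str]:
    """Rebuild the word sequence from (character, tag) pairs."""
    words: List[str] = []
    for ch, tag in tagged:
        # A B tag opens a new word; a leading I is treated as B for robustness.
        if tag == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Hypothetical example sentence, already segmented into three words.
segmented = ["自然", "语言", "理解"]
tagged = words_to_bi_tags(segmented)
restored = bi_tags_to_words(tagged)
```

Under this encoding a sequence labeler only has to predict one of two tags per character, and the segmentation is recovered by splitting before every B tag.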
Keywords/Search Tags:Chinese lexical analysis, Chinese Chunk Parsing, Conditional Random Fields, Maximum Entropy, Labeling Sequential Data