Font Size: a A A

Analysis And Research Of Probabilistic CYK Algorithm

Posted on:2011-03-28Degree:MasterType:Thesis
Country:ChinaCandidate:P C LiFull Text:PDF
GTID:2178360305471498Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
This paper briefly introduces the classification of words and part-of-speech tagging ,then introduces a computer model of syntactic processing: context-free grammar. Context-free grammar which is aslo referred to as phrase structure grammar, is the most commonly mathematical system used to simulate the composition structure of natural language .Syntactic analysis is based on context-free grammar to identify sentences and assign sentences'structure. Some sentences have more than one structure,that is to say ,these sentences are ambiguous. Context-free grammar can effectively express ambiguity,but it cann't resolve the ambiguity. Probabilistic context-free grammar can provide a solution to this problem: select the one whose probability is maximum from the ambiguity. As ambiguity is so common,probabilistic parsing is very significant to syntactic processing and natural language processing.In the existing parsing algorithm, probabilistic CYK algorithm is a standard algorithm for probabilistic parsing. It has the following characteristics:1. The probability context-free grammar required by probabilistic CYK algorithm must be a Chomsky normal form.Rules of Chomsky normal form take two forms: Aâ†'BC or Aâ†'x( A, B and C are non-terminal symbols,x is a non-terminal symbol or terminal symbol). A context-free grammar can be transformed into a weak equivalence of the Chomsky normal form grammar,so probabilistic CYK algorithm can be applied to any Probabilistic context-free grammar.2. Probabilistic CYK algorithm is essentially a bottom-up parsing, using breadth-first search strategy. 3. Probabilistic CYK algorithm is a dynamic programming algorithm. This algorithm is carried out in parallel, there is no backtrack and redundant operation, the time complexity is O(n 3).Based on the bottom-up parsing characteristics of probabilistic CYK algorithm, the traditional sequencing of grammar rules actually constrain the speed of parsing. The search steps in probabilistic CYK algorithm is reduced and the probabilistic CYK algorithm is improved by rearranging rules of a given traditional grammar and establishing a index table. Finally we complete the improved probabilistic CYK algorithm ,and parse some sentences to validate the approach.
Keywords/Search Tags:syntactic parsing, context free grammar, probabilistic context free grammar, probabilistic cyk algorithm
PDF Full Text Request
Related items