Research On Chinese Integrated Parsing Model

Posted on:2005-09-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y D Chen

Full Text:PDF

GTID:2168360155971974

Subject:Computer Science and Technology

Abstract/Summary:

With the development of computers and the spread of the Internet, the volume of text has grown too large to be processed by hand. The goal of natural language understanding (NLU) is to process text quickly and with high quality. In this paper, based on a study of the ambiguities that arise at each step of NLU, an integrated parsing model is put forward. The essence of the model is to resolve as many ambiguities as possible at the current step and to pass those that cannot be resolved there on to the next step; a semantic strategy is designed to reduce the complexity of syntactic analysis.

Disambiguation is the principal task of NLU. This paper studies the various ambiguities in NLU and the approaches related to them. After comparing several integrated approaches, we put forward a new integrated parsing model. Its principles are to pass ambiguities that cannot be resolved at the current step on to the next step, and to disambiguate as far as possible at the current step in order to reduce the complexity of the next one.

Word segmentation is the first step in NLU, and its quality affects every later step. We first define the sentence coverage rate and the word coverage rate, and then design a Bi-Directed Maximum Match algorithm based on a directed graph. We show that this segmentation algorithm is feasible. Its advantage is that it preserves ambiguities by keeping several candidate segmentation sequences. Compared with classic rule-based approaches and omni-segmentation, the algorithm achieves a high coverage rate at low complexity.

In tagging, we modify the Viterbi algorithm because it ignores tagging ambiguities. A strategy for transferring probabilities between tagging and parsing is provided so that tagging and parsing can proceed in parallel. By transferring the probabilities backward, all tagging ambiguities are preserved; in addition, the probabilities provide context that is used in syntactic analysis. We illustrate the advantage with an example.

Parsing would carry a heavy load because it receives the ambiguities passed on from the earlier steps. We combine the semantic strategy with the integrated parsing model so as to prune irrational syntactic trees. Through semantic tagging and semantic matching between words, the semantic strategy performs word sense disambiguation.

Finally, we summarize the integrated parsing model, present its disadvantages, and point out directions for future research.
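
The idea of preserving segmentation ambiguity with several candidate sequences can be illustrated with a minimal sketch. It uses plain forward and backward maximum matching, not the directed-graph variant developed in the thesis, whose details are not given in this abstract. The lexicon, the example sentence, and all function names are assumptions made for illustration only.

```python
# Toy lexicon (an assumption for this sketch, not data from the thesis).
LEXICON = {"研究", "研究生", "生命", "起源"}


def forward_max_match(text, lexicon, max_len):
    """Scan left to right, always taking the longest lexicon word.

    A single character is accepted as a fallback when nothing matches.
    """
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, lexicon, max_len):
    """Scan right to left, always taking the longest lexicon word."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()
    return tokens


def candidate_segmentations(text, lexicon):
    """Return one sequence if both scans agree, otherwise both.

    Keeping both sequences defers the decision to a later step
    instead of forcing a choice during segmentation.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    fwd = forward_max_match(text, lexicon, max_len)
    bwd = backward_max_match(text, lexicon, max_len)
    return [fwd] if fwd == bwd else [fwd, bwd]


if __name__ == "__main__":
    for candidate in candidate_segmentations("研究生命起源", LEXICON):
        print(" / ".join(candidate))
```

On this input the two scans disagree (the forward scan starts with 研究生, the backward scan yields 研究 / 生命 / 起源), so both sequences are returned and the choice is left to tagging or parsing.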

Keywords/Search Tags:

The integrated parsing model, Bi-Directed Maximum Match, Hidden Markov model, Viterbi algorithm, Probability-based Generalized LR algorithm

Related items

1	Viterbi Algorithm: Analysis And Implement
2	HMM-based Chinese Part-of-Speech Tagging And Improvement
3	Detection Of Cell Division Sequence Based-on Hidden Markov Model
4	Study On Chinese Named Entity Recognition Based On Hidden Markov Model
5	Research Of Web Text Mining Technology Based On Hidden Markov Model
6	The Algorithm Research Of Chinese Information Extraction Based On The Hidden Markov Model
7	Markov Model-Based Sentence-Level Input Method Algorithm Prototype Design And Implementation
8	Chinese Address Name Recognition Algorithm Design And Implementation
9	Research On Multiresolution Hidden Markov Model For Image Denoising
10	Research On Recommendation Algorithm Based On Hidden Markov Model