Research Of Chinese Dependency Analysis Based On Root Node

Posted on: 2009-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y G Yang
Full Text: PDF
GTID: 2178360278453571
Subject: Computer application technology
Abstract/Summary:
Chinese dependency analysis is an important approach to syntactic analysis, and syntactic analysis is one of the key technologies in natural language processing. Chinese dependency analysis is based on dependency constraints and identifies the dependencies between words. The word is the smallest element of a sentence, and word-level dependencies can represent deep syntactic relations, so this thesis studies Chinese dependencies between words.

Nivre's algorithm has been used for English and Spanish dependency analysis with good results. Because the syntactic structures of Chinese and English are similar, this thesis first adopts the deterministic Nivre's algorithm, which parses a sentence by deciding only whether the current word modifies the word immediately beside it. Conventional methods of this kind, however, have difficulty parsing long-distance relations. Yang Yang proposed "Deterministic Nivre's algorithm with consideration of long-distance dependency", which solves that problem. That method, however, does not use information from the full sentence, which limits the accuracy of the analyzer.

To solve this problem, this thesis constructs a root node finder using a Preference Learning algorithm, which draws on information from the full sentence when identifying the root node. Improving the root node accuracy indirectly improves the dependency accuracy. In addition, the root node divides the sentence into two independent sub-sentences. This not only reduces the difficulty of the analysis but also avoids dependencies that span the root node. The system analyzes the two sub-sentences separately and combines their results to obtain the dependency analysis of the full sentence.

Experiments on the Harbin Institute of Technology Corpus show that the root accuracy exceeds that of the previous system by 9.6% and reaches 81.20%. The dependency accuracy also improves, reaching 79.44%, and even reaches 98.62% in the closed test.

An error analysis of these experimental results finds that the coarse part-of-speech tags of the Harbin Institute of Technology Corpus have led to some of the analysis errors. To use more precise features for learning, this thesis subdivides the part-of-speech tagging system using a method based on the Hidden Markov Model. The root accuracy improves again and reaches 83.90%, and the dependency accuracy is 79.64% in the open test.
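The root-based strategy described in the abstract (find the root, split the sentence at it, analyze each sub-sentence, then combine the results) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The names `parse_with_root_split`, `first_verb_root` and `chain_parser` are hypothetical. The root finder and the sub-sentence parser are toy stand-ins for the Preference Learning root finder and the Nivre-style parser. The sketch also assumes that the root itself belongs to neither sub-sentence and that any word left unattached inside a sub-sentence attaches to the root.

```python
from typing import Callable, List, Tuple

Token = Tuple[str, str]   # (word form, part-of-speech tag)
Heads = List[int]         # heads[i] = index of the head of word i, -1 for the root


def parse_with_root_split(
    tokens: List[Token],
    find_root: Callable[[List[Token]], int],
    parse_segment: Callable[[List[Token]], Heads],
) -> Heads:
    """Split the sentence at the root, parse each side, and merge the results."""
    root = find_root(tokens)
    heads = [-1] * len(tokens)
    # Assumption: the root is excluded from both sub-sentences.
    segments = ((0, tokens[:root]), (root + 1, tokens[root + 1:]))
    for offset, segment in segments:
        if not segment:
            continue
        local = parse_segment(segment)
        for i, h in enumerate(local):
            # Assumption: a word with no head inside its segment attaches to the root.
            heads[offset + i] = root if h == -1 else offset + h
    return heads


def first_verb_root(tokens: List[Token]) -> int:
    """Toy root finder (hypothetical): the first token tagged 'v', else the last token."""
    for i, (_, pos) in enumerate(tokens):
        if pos == "v":
            return i
    return len(tokens) - 1


def chain_parser(segment: List[Token]) -> Heads:
    """Toy sub-sentence parser (hypothetical): each word modifies its right neighbour."""
    return [i + 1 for i in range(len(segment) - 1)] + [-1]


tokens = [("我", "r"), ("喜欢", "v"), ("自然", "n"), ("语言", "n")]
heads = parse_with_root_split(tokens, first_verb_root, chain_parser)
print(heads)
```

Because each sub-sentence is parsed on its own and only its unattached words are linked to the root, no dependency in the merged result can connect a word on the left of the root to a word on its right, which matches the property the abstract claims for the method.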
Keywords/Search Tags:Chinese dependency analysis, Support Vector Machines, Nivre's algorithm, Preference Learning, Part-of-Speech tagging system conversion