Research On Ensemble Classifiers For Japanese Dependency Parsing

Posted on: 2012-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Gao
Full Text: PDF
GTID: 2218330368488084
Subject: Computer application technology
Abstract/Summary:
Syntactic parsing is the process of generating a syntax tree for a sentence. There are two main approaches: dependency-based parsing and phrase-structure-based parsing. Dependency parsing represents the relations between words, and its output can be converted easily into a semantic description. This thesis focuses on Japanese dependency parsing, which is based on relations between words and is easier to implement than parsing based on phrase structures.

Dependency parsing for English and Japanese has already produced good research results. Of the many methods proposed for Japanese dependency parsing, the two principal ones are Nivre's algorithm and the maximum spanning tree (MST) algorithm. However, neither performs well in every respect. The main question addressed here is therefore how to combine the algorithms and exploit their complementary strengths to improve parsing performance.

In this thesis, we first implement Japanese dependency parsing systems based on the maximum spanning tree (MST) algorithm, the Cascaded Chunking Model algorithm, the Elimination algorithm, and Nivre's algorithm. Then, drawing on the theory of ensemble classifiers, we combine the graph-based and transition-based dependency parsing algorithms and use the combined algorithm for training and testing. Next, we use a voting technique to combine the Cascaded Chunking Model algorithm, the Elimination algorithm, and Nivre's algorithm. Finally, we use the ensemble strategy to combine all four algorithms above, and we analyze the voting weight of each algorithm. We compared our system with other Japanese dependency parsing systems and found that it is robust.

Our training and test data are selected from the Kyoto University Corpus (Version 4.0). Experiments show that the proposed methods achieve a phrase-level accuracy of 90.46% and a sentence-level accuracy of 49%.
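To make the voting step more concrete, the following is a minimal sketch of weighted voting over head predictions, not the thesis's actual implementation. It assumes each parser outputs one head index per phrase (with -1 marking the final phrase, which has no head), and that each parser has a single scalar voting weight. The function name `weighted_vote`, the parser labels, and the weight values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine per-phrase head predictions from several parsers.

    predictions: dict mapping a parser name to a list of head indices,
                 one per phrase (assumed convention: -1 = no head).
    weights:     dict mapping a parser name to its voting weight
                 (hypothetical values, not taken from the thesis).
    """
    lengths = {len(heads) for heads in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all parsers must predict the same number of phrases")
    n_phrases = lengths.pop()

    combined = []
    for i in range(n_phrases):
        scores = defaultdict(float)
        for name, heads in predictions.items():
            scores[heads[i]] += weights[name]
        # Highest total weight wins; ties go to the smaller head index
        # so the result is deterministic.
        combined.append(min(scores, key=lambda h: (-scores[h], h)))
    return combined


def phrase_accuracy(predicted, gold):
    """Fraction of phrases whose predicted head matches the gold head."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Illustrative input: four phrases, four parsers that partly disagree.
predictions = {
    "nivre": [1, 2, 3, -1],
    "cascaded_chunking": [1, 3, 3, -1],
    "elimination": [2, 3, 3, -1],
    "mst": [1, 2, 3, -1],
}
weights = {"nivre": 1.0, "cascaded_chunking": 1.2, "elimination": 1.0, "mst": 0.9}

combined = weighted_vote(predictions, weights)
```

Voting on each phrase independently is a simplification: it does not by itself guarantee that the combined heads form a well-formed dependency tree, so a real system would likely need an additional constraint or repair step.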
Keywords/Search Tags: Japanese dependency analysis, Ensemble classifier system, Support Vector Machine (SVM)