
Research On Japanese Dependency Parsing Technology

Posted on: 2012-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: J Cheng
Full Text: PDF
GTID: 2248330371458229
Subject: Computer application technology
Abstract/Summary:
Japanese dependency analysis is recognized as a basic technique in Japanese sentence analysis. It determines the dependency relations between "bunsetsu" (phrasal units) according to Japanese dependency grammar. Syntactic parsing is the foundation of deeper natural language processing such as semantic analysis, and it is an indispensable component of many natural language processing applications. Dependency analysis plays an important role in machine translation, information extraction, automatic question answering and other fields.
Current research on Japanese dependency analysis has focused on changing the learning framework, and the machine learning algorithms used have been Support Vector Machines or other margin-based and memory-based learning methods. Conditional Random Fields are an excellent sequence labeling model. They have been applied successfully to natural language processing tasks with good results. However, their application to Japanese dependency analysis has not been reported. This thesis proposes a new method that combines the cascade chunking algorithm with Conditional Random Fields and draws on rich contextual information, so that each unit receives an optimal label from the viewpoint of the whole sentence. Experiments on the Kyoto University Text Corpus (Version 4.0) show that the method achieves good dependency accuracy and sentence accuracy even without dynamic features.
Rule-based methods, as a useful complement to statistical methods, are still widely used in many natural language processing fields. The traditional rule-based method relies on rules hand-written by knowledge engineers from their own experience, so it depends entirely on the linguistic knowledge of the engineers who develop the rules. Creating a rule set requires a great deal of manpower and material resources.
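The cascade chunking procedure named in the abstract can be illustrated with a small sketch. It follows the commonly described form of the algorithm: each remaining bunsetsu is tagged "D" (it modifies its right-hand neighbour) or "O" (undecided), resolved units are removed, and the pass is repeated. The abstract does not give the thesis's exact variant, so the `Labeler` interface, the `oracle_labeler` stand-in for a trained Conditional Random Fields model, and the handling of a leftmost "D" unit are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence

# Hypothetical labeler interface (not from the thesis): given the indices of
# the bunsetsu still in play, return one tag per unit, "D" or "O".
Labeler = Callable[[Sequence[int]], List[str]]


def cascade_chunk(n: int, label: Labeler) -> List[int]:
    """Return the head index of each of n bunsetsu (-1 for the final unit)."""
    heads = [-1] * n
    tags = ["O"] * n                      # "O" = dependency still undecided
    remaining = list(range(n))
    while len(remaining) > 1:
        proposed = label(remaining)       # whole-sequence labeling pass
        last = len(remaining) - 1
        for pos in range(last):
            seg = remaining[pos]
            if tags[seg] == "D":
                continue                  # decided in an earlier pass
            # The unit just before the final one always modifies it in a
            # head-final analysis; this also guarantees progress.
            if pos == last - 1 or proposed[pos] == "D":
                tags[seg] = "D"
                heads[seg] = remaining[pos + 1]
        survivors = []
        for pos, seg in enumerate(remaining):
            # Assumption: a leftmost unit is treated as if it followed "O".
            prev_tag = tags[remaining[pos - 1]] if pos > 0 else "O"
            if tags[seg] == "D" and prev_tag == "O":
                continue                  # resolved, nothing left can modify it
            survivors.append(seg)
        remaining = survivors
    return heads


def oracle_labeler(gold_heads: Sequence[int]) -> Labeler:
    """Toy labeler that reads tags off known heads, standing in for a CRF."""
    def label(remaining: Sequence[int]) -> List[str]:
        out = []
        for pos, seg in enumerate(remaining):
            has_right = pos + 1 < len(remaining)
            is_d = has_right and gold_heads[seg] == remaining[pos + 1]
            out.append("D" if is_d else "O")
        return out
    return label
```

With the toy labeler, `cascade_chunk(4, oracle_labeler([3, 2, 3, -1]))` recovers `[3, 2, 3, -1]` in three passes. In a real system the labeler would be a trained sequence model, which is where whole-sentence context enters the decision.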
To compensate for the shortcomings of the traditional rule-based method, an error-driven technique based on Conditional Random Fields is adopted to parse the sentence a second time and improve the parsing results. It uses statistical methods to learn error patterns automatically and obtains a recognition model through training. The results of the first parsing stage with Conditional Random Fields are added as features to the feature template of the second stage, which learns the error patterns and corrects the errors in the second parse. Experimental results on the same corpus mentioned above show that this method further improves the accuracy of dependency analysis.
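The two-stage idea described above, where first-stage output becomes an input feature of the second stage, can be sketched as a simple feature-augmentation step. The abstract does not give the actual feature template, so the feature names (`stage1_tag`, `stage1_prev`, `stage1_next`), the boundary markers and the function `add_first_stage_features` are hypothetical, and the sketch is independent of any particular Conditional Random Fields toolkit.

```python
from typing import Dict, List

Features = Dict[str, str]


def add_first_stage_features(
    base_features: List[Features], first_stage_tags: List[str]
) -> List[Features]:
    """Copy each unit's features and add the first-stage tags around it.

    The feature names and the BOS/EOS markers are illustrative assumptions,
    not the template used in the thesis.
    """
    if len(base_features) != len(first_stage_tags):
        raise ValueError("need exactly one first-stage tag per unit")
    n = len(first_stage_tags)
    augmented = []
    for i, feats in enumerate(base_features):
        new_feats = dict(feats)  # leave the first-stage features untouched
        new_feats["stage1_tag"] = first_stage_tags[i]
        new_feats["stage1_prev"] = first_stage_tags[i - 1] if i > 0 else "BOS"
        new_feats["stage1_next"] = first_stage_tags[i + 1] if i + 1 < n else "EOS"
        augmented.append(new_feats)
    return augmented
```

A second-stage model trained on these augmented features against the gold labels can learn where the first stage tends to be wrong, which is the sense in which the approach is error-driven.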
Keywords/Search Tags:Japanese dependency analysis, Conditional Random Fields, Machine learning, Contextual information, Error-driven technique