
Research On Chinese Dependency Parsing Based On Statistical Methods

Posted on: 2012-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X N Ren
GTID: 2248330371958263
Subject: Computer software and theory
Abstract/Summary:
Parsing is one of the core subjects of natural language processing. The goal of dependency parsing is to derive the syntactic structure of a sentence automatically according to a dependency grammar. Dependency grammar is easy to understand, annotate and use, and dependency parsing is widely applicable to other tasks such as relation extraction, machine translation, ontology construction and semantic role labeling.

There are two kinds of parsing methods: rule-based and statistical. Rule-based methods predominated in earlier research. Statistical methods began to emerge in the mid-1980s in response to the shortcomings of rule-based methods, and they have predominated since the 1990s thanks to easy access to language resources. This paper investigates techniques for Chinese dependency parsing using corpus-based statistical learning methods, focusing on the following three points.

First, many Chinese treebanks are annotated as phrase structures, so Chinese dependency treebanks are scarce. Many researchers at home and abroad have therefore tried to transform phrase structure treebanks into dependency treebanks, which first requires finding the head of each semantic constituent. This paper proposes a method that combines rules and statistics, based on cascaded conditional random fields, to improve the precision of head finding and thereby support the conversion between treebanks.

Second, recognizing dependency relations in long Chinese sentences is one of the difficult problems in Chinese dependency parsing and a primary factor limiting its performance. A long Chinese sentence can be split into two sub-sentences at its predicate head, which reduces the difficulty of parsing.
This paper proposes an effective approach to automatically recognizing the predicate head in Chinese sentences, combining statistical pre-processing with rule-based post-processing, as a basis for further analysis of dependency relations in long sentences.

Third, Chinese dependency parsing can be completed in two steps: recognizing dependency arcs and recognizing dependency relation types. Arc recognition can be treated as the classification of phrase pairs. An efficient search algorithm based on dynamic programming is proposed to speed up the search. On this basis, the method is combined with an MST parser to improve the accuracy of dependency arc analysis. Recognition of the dependency relation type can likewise be treated as a classification task. Analysis and comparison of the experimental results verify the effectiveness of the method.

The first two parts of the paper support dependency parsing: each addresses a difficult problem in Chinese dependency parsing at a different level. The last part makes theoretical and practical explorations of Chinese dependency parsing itself.
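The treebank conversion described in the first point can be illustrated with a minimal sketch. It assumes a purely rule-based head finder; the thesis itself combines rules with cascaded conditional random fields, which this sketch does not attempt to reproduce. The `Node` class, the `HEAD_RULES` table and the function names are hypothetical and chosen only for illustration; the rule table is not taken from the thesis or from any published head table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    label: str                                   # phrase label or POS tag
    children: List["Node"] = field(default_factory=list)
    word: str = ""                               # set only on leaves
    index: int = -1                              # 1-based token position, leaves only


# Hypothetical head rules: search direction plus a priority list of child labels.
HEAD_RULES: Dict[str, Tuple[str, List[str]]] = {
    "IP": ("right", ["VP", "IP"]),
    "VP": ("left", ["VV", "VP"]),
    "NP": ("right", ["NN", "NR", "NP"]),
}


def head_child(node: Node) -> Node:
    """Pick the head child of a phrase using the (assumed) rule table."""
    direction, priorities = HEAD_RULES.get(node.label, ("right", []))
    kids = node.children if direction == "left" else list(reversed(node.children))
    for label in priorities:
        for kid in kids:
            if kid.label == label:
                return kid
    return kids[0]  # fallback: first child in the search direction


def to_dependencies(node: Node, arcs: List[Tuple[int, int]]) -> int:
    """Return the head token index of `node`; append (head, dependent) arcs."""
    if not node.children:
        return node.index
    head = head_child(node)
    head_idx = to_dependencies(head, arcs)
    for kid in node.children:
        if kid is not head:
            # The lexical head of each non-head child depends on the phrase head.
            arcs.append((head_idx, to_dependencies(kid, arcs)))
    return head_idx


if __name__ == "__main__":
    # (IP (NP (PN 我)) (VP (VV 喜欢) (NP (NN 苹果))))  "I like apples"
    tree = Node("IP", [
        Node("NP", [Node("PN", word="我", index=1)]),
        Node("VP", [
            Node("VV", word="喜欢", index=2),
            Node("NP", [Node("NN", word="苹果", index=3)]),
        ]),
    ])
    arcs: List[Tuple[int, int]] = []
    root = to_dependencies(tree, arcs)
    print(root, arcs)
```

In this sketch the quality of the resulting dependency tree depends entirely on `head_child`, which is why improving the precision of head finding, as the abstract describes, directly benefits the conversion.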
Keywords/Search Tags:Chinese Dependency Parsing, Dynamic Programming Algorithm, Head Constituent, Predicate Head of Chinese Sentence, Multi-strategy Combination