Research On Chinese Dependency Parsing Based On Statistical Methods

Posted on:2012-01-07

Degree:Master

Type:Thesis

Country:China

Candidate:X N Ren

Full Text:PDF

GTID:2248330371958263

Subject:Computer software and theory

Abstract/Summary:

Parsing is one of the core topics of natural language processing. The goal of dependency parsing is to derive the syntactic structure of a sentence automatically according to dependency grammar. Dependency grammar is easy to understand, annotate, and use, and dependency parsing is widely applicable to other tasks such as relation extraction, machine translation, ontology construction, and semantic role labeling.

There are two kinds of parsing methods: rule-based and statistical. Rule-based methods predominated in early research. Statistical methods began to emerge in the mid-1980s in response to the shortcomings of rule-based methods, and they have predominated since the 1990s thanks to easy access to language resources. This thesis investigates corpus-based statistical learning techniques for Chinese dependency parsing, focusing on the following three points.

First, many Chinese treebanks are annotated with phrase structure, so Chinese dependency treebanks are in short supply. Many researchers in China and abroad have therefore tried to convert phrase-structure treebanks into dependency treebanks, which first requires finding the heads of the semantic constituents. This thesis proposes a method that combines rules and statistics, based on cascaded conditional random fields, to improve the precision of head finding, which benefits treebank conversion.

Second, recognizing dependency relations in long Chinese sentences is one of the difficult problems in Chinese dependency parsing and a primary factor limiting its performance. A long Chinese sentence can be divided into two sub-sentences by the predicate head, which reduces the difficulty of parsing. This thesis proposes an effective approach that automatically recognizes the predicate head in Chinese sentences using statistical pre-processing and rule-based post-processing, in support of further analysis of long-sentence dependency relations.

Third, Chinese dependency parsing can be completed in two steps: recognizing dependency arcs and recognizing dependency relation types. Dependency arc recognition can be regarded as the classification of phrase pairs. An efficient search algorithm based on dynamic programming is proposed to improve search efficiency, and the method is then combined with an MST parser to improve the accuracy of dependency arc analysis. Recognition of the dependency relation type can likewise be regarded as a classification task. Analysis and comparison of the experimental results verify the effectiveness of the method.

The first two parts of the thesis serve dependency parsing, each addressing a difficult problem in Chinese dependency parsing at a different level. The last part mainly makes theoretical and practical explorations of Chinese dependency parsing itself.
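The treebank conversion mentioned in the first point can be illustrated with a minimal sketch of rule-based head finding. This is not the thesis's method, which combines rules with cascaded conditional random fields; it is only the plain head-rule baseline that such work usually starts from. The `Node` class, the `HEAD_RULES` table, and the function names are all hypothetical and chosen for illustration, and the three rules shown are assumptions rather than a real Chinese head-rule table.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    label: str                      # phrase label (e.g. "NP") or POS tag (e.g. "VV")
    word: str = ""                  # set only on leaves
    index: int = -1                 # 1-based token position, set only on leaves
    children: List["Node"] = field(default_factory=list)


# Hypothetical head rules: phrase label -> (search direction, child labels by priority).
HEAD_RULES: Dict[str, Tuple[str, List[str]]] = {
    "IP": ("right", ["VP", "IP"]),
    "VP": ("left", ["VV", "VP"]),
    "NP": ("right", ["NN", "NR", "NP"]),
}


def find_head_child(node: Node) -> Node:
    """Pick the head child of a phrase using the priority list for its label."""
    direction, priorities = HEAD_RULES.get(node.label, ("right", []))
    ordered = node.children if direction == "left" else list(reversed(node.children))
    for label in priorities:
        for child in ordered:
            if child.label == label:
                return child
    return ordered[0]               # fallback: first child in the search direction


def to_dependencies(node: Node, arcs: List[Tuple[int, int]]) -> int:
    """Return the token index of the lexical head; collect (head, dependent) arcs."""
    if not node.children:
        return node.index
    head_child = find_head_child(node)
    head_index = to_dependencies(head_child, arcs)
    for child in node.children:
        if child is not head_child:
            arcs.append((head_index, to_dependencies(child, arcs)))
    return head_index


def convert(root: Node) -> List[Tuple[int, int]]:
    """Convert a phrase-structure tree to dependency arcs; index 0 is the artificial root."""
    arcs: List[Tuple[int, int]] = []
    arcs.append((0, to_dependencies(root, arcs)))
    return sorted(arcs, key=lambda arc: arc[1])


# (IP (NP (PN 我)) (VP (VV 爱) (NP (NR 北京))))
example_tree = Node("IP", children=[
    Node("NP", children=[Node("PN", "我", 1)]),
    Node("VP", children=[
        Node("VV", "爱", 2),
        Node("NP", children=[Node("NR", "北京", 3)]),
    ]),
])

print(convert(example_tree))        # [(2, 1), (0, 2), (2, 3)]
```

In this sketch the verb 爱 (token 2) becomes the root, with 我 and 北京 as its dependents. A statistical head finder would replace `find_head_child` with a learned classifier while keeping the same recursive conversion.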

Keywords/Search Tags:

Chinese Dependency Parsing, Dynamic Programming Algorithm, Head Constituent, Predicate Head of Chinese Sentence, Multi-strategy Combination

Related items

1	Research On The Recognition Of Chinese Predicate Head Words Based On Neural Network
2	Study And Application Of Chinese Sentence Structure Clustering
3	Study Of Chunk System Oriented To Sentence Parsing
4	Research On Dependency Parsing Of Chinese Simple Sentence
5	Study On Chinese Constituent Parsing
6	Chinese Dependency Parsing Based On Deep Learning
7	Research On Automatic Recognition Of Sentence Patterns Of Modern Chinese
8	Research On Graph-based Chinese Dependency Parsing
9	Research On Chinese Dependency Parsing With High-Performance
10	Research On Mongolian Dependency Parsing Based On The Conversion Of Chinese-Mongolian Dependency Parsing Tree