Research on Dependency Parsing of Chinese Simple Sentences

Posted on: 2007-05-02  Degree: Master  Type: Thesis
Country: China  Candidate: Q L Zhou  Full Text: PDF
GTID: 2178360212983695  Subject: Computer application technology
Abstract/Summary:
Parsing plays an important role in natural language processing: it is the bridge that connects word segmentation with semantic analysis. Rule-based and statistical approaches each have their merits, and how to combine them to improve parser performance is a vital subject in computational linguistics. At the same time, structural ambiguity has become a major obstacle that limits parsing accuracy.

In this paper, we analyze the research background on these problems, both in China and abroad. Taking dependency grammar as the language processing model, we propose a strategy for Chinese parsing that combines rules with statistics. The model divides parsing into three stages: the first is chunk parsing, the second analyzes the dependency relations inside each chunk, and the third analyzes the dependency relations between chunks. In each stage, we apply rule-based and statistical methods according to the characteristics of that stage. To realize this strategy, we study the following aspects:

1. Long-distance dependency in Chinese simple sentences. Long-distance dependency is ubiquitous in every language. We analyze the structure and semantics of Chinese simple sentences and, on that basis, construct dependency patterns that recognize long-distance dependency relations.

2. Structural disambiguation. We use two methods.
(1) A general strategy for all kinds of ambiguous structures. Ambiguous structures appear at every stage of parsing, so we propose a disambiguation strategy that applies to all of them. It uses collocation information among words and a modified t-test to compute a "Collocation Coefficient".
(2) A specific strategy for particular ambiguous structures. To disambiguate such structures better, we take a representative one, "verb phrase + noun phrase + de (auxiliary word) + noun phrase" (VNN), as a sample for study and discussion, and we propose two methods for disambiguating it, one based on HowNet and one based on maximum entropy.

We divide parsing into stages and use different methods in each stage, because this reduces rule conflicts and makes the statistical analysis more targeted, thereby improving its accuracy. The key problem facing parsing, however, is structural disambiguation. Combining general and specific disambiguation for different ambiguous structures gave better experimental results. For the typical VNN structure, accuracy can exceed 80% with the disambiguation methods based on HowNet and maximum entropy.
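
The abstract does not give the formula for the modified t-test behind the "Collocation Coefficient". As a rough illustration only, the sketch below computes the standard, unmodified t-score for adjacent word pairs, which compares the observed bigram frequency with the frequency expected if the two words were independent. The function name t_scores and the toy token list are assumptions made for this example and do not come from the thesis.

```python
import math
from collections import Counter


def t_scores(tokens):
    """Return {(w1, w2): t} for every adjacent word pair in `tokens`.

    Standard t-score for collocations (NOT the thesis's modified variant):
        t = (x_bar - mu) / sqrt(s2 / n)
    where x_bar is the observed bigram probability, mu is the probability
    expected under independence, and s2 = x_bar * (1 - x_bar).
    """
    n = len(tokens) - 1  # number of bigram positions
    if n < 1:
        return {}
    total = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), c12 in bigrams.items():
        x_bar = c12 / n
        mu = (unigrams[w1] / total) * (unigrams[w2] / total)
        s2 = x_bar * (1 - x_bar)
        # Guard against zero variance (a corpus made of one repeated bigram).
        scores[(w1, w2)] = (x_bar - mu) / math.sqrt(s2 / n) if s2 > 0 else 0.0
    return scores


if __name__ == "__main__":
    # Toy, hypothetical token sequence; a real system would use a segmented corpus.
    corpus = "a b a b a b c d".split()
    for pair, t in sorted(t_scores(corpus).items(), key=lambda kv: -kv[1]):
        print(pair, round(t, 3))
```

In a disambiguation setting, scores of this kind could be compared across the competing attachments of an ambiguous structure, with the more strongly collocated pair preferred; how the thesis actually modifies and applies the test is not stated in the abstract.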
Keywords/Search Tags: Parsing, Ambiguous structure, Disambiguation, Long-distance dependency, Chunking