Automatic Identification Of Chinese Prepositional Phrase Based On Maximum Entropy

Posted on: 2007-02-04 | Degree: Master | Type: Thesis
Country: China | Candidate: J T Yu | Full Text: PDF
GTID: 2178360212457094 | Subject: Computer application technology
Abstract/Summary:
The prepositional phrase is one of the most important phrase types in Chinese. Identifying prepositional phrases matters for three reasons. First, it reduces the number of candidates in main verb identification. Second, it simplifies sentence structure, which makes subsequent parsing easier. Third, it benefits template matching in example-based machine translation. Complete syntactic parsing, a key problem in natural language processing, remains unsolved. This thesis discusses methods and techniques for Chinese prepositional phrase identification. The first novel aspect of the work is a discussion of the semantics, syntax and usage of Chinese prepositional phrases, together with a carefully designed specification for annotating them, written from a computational point of view and based on related work by Chinese linguists. Building on Church's idea that base NP identification can be treated as part-of-speech tagging, the thesis proposes an effective algorithm that identifies prepositional phrases at the shallow parsing level, using the features described below. The identification system adopts a statistical model based on Maximum Entropy (ME). In practice, an ME model can reach high accuracy with knowledge-poor features. Another advantage of the ME model is its reusability: the theory of the ME framework is independent of any particular natural language task. Feature selection is a key problem for an ME model because it determines identification performance. For the task of Chinese prepositional phrase identification, we propose that words and part-of-speech tags are the main factors that make up the feature space of the ME model, and we present an algorithm that acquires a feature set automatically. The results show that the method is effective for Chinese prepositional phrase identification: in the open test, precision reaches 89.1%. Furthermore, the model is readily extensible and can be used to recognize other phrase types, such as base NPs and longest NPs.
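The tagging view of phrase identification and the word and part-of-speech feature space can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the tag scheme or the feature templates, so the B-PP/I-PP/O labels, the context window size, the helper names `spans_to_tags` and `token_features`, and the part-of-speech tags in the example sentence are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)


def spans_to_tags(n_tokens: int, pp_spans: List[Tuple[int, int]]) -> List[str]:
    """Encode prepositional-phrase spans [start, end) as per-token tags.

    Assumed scheme: B-PP opens a phrase, I-PP continues it, O is outside.
    """
    tags = ["O"] * n_tokens
    for start, end in pp_spans:
        tags[start] = "B-PP"
        for i in range(start + 1, end):
            tags[i] = "I-PP"
    return tags


def token_features(sent: List[Token], i: int, window: int = 2) -> Dict[str, str]:
    """Collect word and POS features in a window around token i.

    Positions outside the sentence are filled with a padding symbol.
    The window size is an illustrative choice, not taken from the thesis.
    """
    feats: Dict[str, str] = {}
    for off in range(-window, window + 1):
        j = i + off
        if 0 <= j < len(sent):
            word, pos = sent[j]
        else:
            word, pos = "<PAD>", "<PAD>"
        feats[f"w[{off}]"] = word
        feats[f"p[{off}]"] = pos
    return feats


# Hypothetical example: "他 在 北京 工作" (He works in Beijing),
# with the prepositional phrase "在 北京" covering tokens 1 and 2.
sentence: List[Token] = [("他", "r"), ("在", "p"), ("北京", "ns"), ("工作", "v")]
tags = spans_to_tags(len(sentence), [(1, 3)])
features = [token_features(sentence, i, window=1) for i in range(len(sentence))]
```

Each (feature dictionary, tag) pair would then serve as one training event for a maximum entropy classifier, which predicts a tag for every token in the same way a part-of-speech tagger does.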
Keywords/Search Tags:Natural Language Processing, Shallow Parsing, Prepositional Phrase Identification, Maximum Entropy