Automatic Identification Of Chinese Prepositional Phrase Based On CRF

Posted on: 2009-07-19 | Degree: Master | Type: Thesis
Country: China | Candidate: S L Hu | Full Text: PDF
GTID: 2178360272970318 | Subject: Computer application technology
Abstract/Summary:
Prepositional phrases are among the most important phrase types in Chinese. Identifying them matters in three respects. First, it reduces the number of candidates in main verb identification. Second, it simplifies sentence structure and makes subsequent parsing easier. Third, it benefits template matching in example-based machine translation.

Complete syntactic parsing, a key problem in natural language processing, remains unsolved. This thesis discusses methods and techniques for Chinese prepositional phrase identification. Building on Church's idea that BaseNP identification can be treated as part-of-speech tagging, it proposes an effective algorithm that identifies prepositional phrases at the shallow parsing level. The identification system adopts a statistical model based on conditional random fields (CRF).

In practice, a CRF model can reach high accuracy with knowledge-poor features. Another advantage is reusability: the CRF framework is independent of any particular natural language task. In the first experiment, a one-layer CRF model is used to identify all prepositional phrases, and its precision on nested phrases turns out to be comparatively low. Corpus statistics show that, in large volumes of text, doubly nested phrases far outnumber triply nested ones, so using CRF models alone to recognize prepositional phrases is sufficient. In the second experiment, non-nested prepositional phrases and the inner parts of nested ones are first recognized at the first level with a CRF; the outer parts of nested prepositional phrases are then recognized at the second level with a CRF; finally, the results of the two levels are combined.

The results show that the method is effective for Chinese prepositional phrase identification: in the open test, precision reaches 90.08%. Furthermore, the model is readily extensible and can be used to recognize other phrase types, such as verb phrases.
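The tagging view and the two-level combination described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `B-PP`/`I-PP`/`O` tag set, the function names, the example sentence, and the hand-written tag sequences standing in for CRF output are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_to_bio(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping phrase spans as B-PP / I-PP / O tags."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B-PP"
        for i in range(start + 1, end):
            tags[i] = "I-PP"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a B-PP / I-PP / O tag sequence back into phrase spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-PP":
            if start is not None:      # a new phrase closes the open one
                spans.append((start, i))
            start = i
        elif tag == "I-PP" and start is not None:
            continue                   # still inside the open phrase
        else:                          # "O", or a stray "I-PP" with no "B-PP"
            if start is not None:
                spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def combine_levels(level1: List[Span], level2: List[Span]) -> List[Span]:
    """Merge spans from both levels, outer phrases listed before inner ones."""
    return sorted(set(level1) | set(level2), key=lambda s: (s[0], -s[1]))


# Hypothetical segmented sentence with one nested prepositional phrase.
tokens = ["在", "从", "北京", "来", "的", "朋友", "家", "里", "吃饭"]

# Hand-written stand-ins for what each tagging level might output:
# level 1 marks non-nested and inner phrases, level 2 marks outer phrases.
level1_tags = ["O", "B-PP", "I-PP", "O", "O", "O", "O", "O", "O"]
level2_tags = ["B-PP"] + ["I-PP"] * 7 + ["O"]

all_spans = combine_levels(bio_to_spans(level1_tags), bio_to_spans(level2_tags))
for start, end in all_spans:
    print(start, end, " ".join(tokens[start:end]))
```

Because a single tag sequence cannot represent overlapping spans, one layer of tags has no way to mark both the inner and the outer phrase of a nest; running two passes and merging their spans, as sketched here, is one way to work around that limit.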
Keywords/Search Tags: Natural Language Processing, Prepositional Phrase Identification, CRF Model