Research On Chinese Prepositional Phrase Identification Based On Simple Noun Phrase

Posted on: 2017-06-06  Degree: Master  Type: Thesis
Country: China  Candidate: L Y Sang  Full Text: PDF
GTID: 2348330488458681  Subject: Computer application technology
Abstract/Summary:
Prepositional phrases are an important class of phrase: their structures are complex, and they account for a rather large proportion of Chinese sentences. Prepositional phrase recognition has long been a focus of natural language processing, because it simplifies sentence structure, reduces sentence complexity and the number of candidate main verbs, makes parsing easier, and improves template matching in translation.

Based on an analysis of the difficulties and research status of Chinese prepositional phrase identification, we propose a new approach that integrates simple noun phrase information into prepositional phrase recognition. We recognize simple noun phrases with a basic CRF model and filter them with rules so that they fit the inner phrase patterns of prepositional phrases. We then use the simple noun phrases to merge fragmented segmented words into complete phrases in our corpus. Finally, we recognize the prepositional phrases with a multi-layer CRF model and use a double error-correction system to correct the result.

Simple noun phrases reduce ambiguity while retaining sufficient grammatical information. This information not only simplifies the structure of the sentence but also eases the conflict between the CRF model's limited window and the long-distance dependence between a prepositional phrase and its context. The double error-correction system combines linguistic knowledge with statistical methods, reduces the data sparseness problem of the statistical model, and effectively improves the recognition of prepositional phrases.

The results show that integrating simple noun phrase information is effective for Chinese prepositional phrase identification. In an experiment on the People's Daily 2000 corpus, which contains 7049 prepositional phrases, precision, recall and F-value are 93.10%, 93.02% and 93.06%, respectively. Our method of prepositional phrase identification could be applied to complex sentence translation and template matching.
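
The reported figures can be checked for internal consistency. The sketch below assumes that the F-value is the balanced harmonic mean of precision and recall and that a predicted phrase counts as correct only when its span matches a gold span exactly; the abstract does not state either convention, so both are assumptions. The function names and the (start, end) span representation are illustrative and not taken from the thesis.

```python
def f_value(precision: float, recall: float) -> float:
    """Balanced F-value: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def span_prf(gold: set, predicted: set) -> tuple:
    """Precision, recall and F-value under exact span matching (assumed criterion).

    Each span is a hypothetical (start, end) token-index pair.
    """
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall, f_value(precision, recall)


# Consistency check on the percentages quoted in the abstract.
print(round(f_value(93.10, 93.02), 2))  # 93.06
```

Under these assumptions, the quoted precision and recall reproduce the quoted F-value of 93.06%.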
Keywords/Search Tags:Simple Noun Phrase, CRF Model, Participle fusion, Transformation Rule Set