Chinese Prepositional Phrase Recognition Based on Fine-grained Phrase Information

Posted on: 2019-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: T Liu
Full Text: PDF
GTID: 2428330566984202
Subject: Computer Science and Technology
Abstract/Summary:
Prepositional phrases are very frequent in Chinese, and their complex, variable structure makes them difficult to recognize. The accuracy of their recognition affects the results of a series of parsing tasks. In natural language processing research, improving prepositional phrase recognition can reduce the complexity of syntactic analysis, improve text classification, and greatly improve machine translation performance. In this paper, by analyzing the grammatical features of prepositional phrases and reviewing the state of research and its difficulties in recent years, a multi-model fusion method for prepositional phrase recognition based on fine-grained phrases is proposed. The method mainly targets complex prepositional phrases, such as nested and parallel (juxtaposed) ones. It not only identifies parallel prepositional phrases but also improves the recognition accuracy of nested prepositional phrases. First, a fine-grained phrase recognition model identifies and merges the phrases in the corpus in order to reduce the internal complexity of prepositional phrases. Then, a CRF model identifies the inner layer of nested prepositional phrases: if a prepositional phrase is nested, the inner layer is recognized; otherwise, the whole prepositional phrase is recognized. Finally, the recognized inner prepositional phrases are merged in the corpus and the feature information is modified in order to train a new model for outer prepositional phrase recognition. In addition, after both inner and outer prepositional phrases have been recognized, a double error correction system corrects the recognized phrases. Fusing fine-grained phrases simplifies sentence structure while preserving sentence information, and it shortens the span of prepositional phrases. The hierarchical nested multi-model recognition method identifies prepositional phrases at the same level at the same time and uses different models for different layers, which makes it better suited to sentences containing nested and parallel prepositional phrases. The double error correction system combines statistics and rules to further improve the experimental results. Five-fold experiments were conducted on the People's Daily corpus of 2000, which includes 7028 prepositional phrases. The results reach 94.33% precision, 94.28% recall, and 94.30% F-measure, improvements of 1.31%, 1.33%, and 1.32%, respectively, over the baseline, a prepositional phrase identification method based on simple noun phrases.
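The layered merging described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `merge_spans`, the bracket notation, the example sentence fragment, and the hand-written spans are all assumptions made for illustration. In the described method the spans would come from the fine-grained phrase model and the CRF models, which are not reproduced here.

```python
from typing import List, Tuple

# (start, end_exclusive, label) over the current token sequence.
# Hypothetical representation; the thesis does not specify one.
Span = Tuple[int, int, str]


def merge_spans(tokens: List[str], spans: List[Span]) -> List[str]:
    """Collapse each non-overlapping span into a single bracketed token."""
    by_start = {start: (end, label) for start, end, label in spans}
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        if i in by_start:
            end, label = by_start[i]
            # Replace the whole span with one token, shortening the sequence.
            merged.append(f"[{label} {' '.join(tokens[i:end])}]")
            i = end
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Hypothetical segmented fragment, roughly "in the survey of university students".
tokens = ["在", "对", "大学", "学生", "的", "调查", "中"]

# Step 1: merge fine-grained phrases (span assumed, standing in for model output).
step1 = merge_spans(tokens, [(2, 4, "NP")])

# Step 2: merge the inner prepositional phrase (span assumed, standing in for
# the first CRF model). Indices refer to the already shortened sequence.
step2 = merge_spans(step1, [(1, 3, "PP")])

# Step 3: recognize the outer prepositional phrase on the merged sequence
# (span assumed, standing in for the second model).
step3 = merge_spans(step2, [(0, 5, "PP")])

print(step1)
print(step2)
print(step3)
```

Each merge shortens the sequence that the next layer sees, which is the sense in which merging reduces the span of the outer prepositional phrase: here the outer phrase covers seven original tokens but only five tokens after the first two merges.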
Keywords/Search Tags:Fine-grained Phrase, Word Segmentation Fusion, Hierarchical Nested Structure, Double Error Correction System