
Eliminate The Ambiguity Of Relation Words In Compound Sentences Based On Rules And BP Neural Network

Posted on: 2019-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xiong
GTID: 2428330548967493
Subject: Software engineering
Abstract/Summary:
Relation words are the bridge that connects the clauses of a Chinese compound sentence, and they are the starting point for the study of Chinese compound sentences. Automatic recognition of compound sentences presupposes correct word segmentation, so segmentation and segmentation disambiguation are the basis for that recognition. In practice, segmentation and part-of-speech tagging of Chinese compound sentences contain many errors, which makes relation words difficult to recognize. The main purpose of this thesis is to re-segment the cases in which a relation word in a compound sentence has been segmented together with other words, and then to tag the parts of speech of the re-segmented result, thereby completing the disambiguation of the segmentation ambiguity. First, the compound sentence corpus is segmented with NLPIR, the word segmentation system of the Chinese Academy of Sciences, and the compound sentence material under 6 sentence patterns is annotated and extracted. During extraction, we analyze the segmentation results and summarize the rules that govern the segmentation of ambiguous words. Next, we study the cases in which relation words are segmented together with other words, extract the affected compound sentences, and establish rules according to the characteristics of these fused words. The relation word ontology library and the rules are then used to locate the ambiguous segmentation fields and segment them correctly. Finally, we build and train a BP neural network model. We extract the part-of-speech features of the context words surrounding each ambiguous field, quantify the weights of these words, map them to the nodes of the neural network, and propagate the error backward to obtain weights suitable for part-of-speech tagging, which achieves the disambiguation. In the experiment, we selected seven common compound sentences and extracted the compound sentences from the CCCS complex sentence corpus; the disambiguation accuracy was 93.4%. The combination of rules and a BP neural network is therefore effective for the segmentation and disambiguation of compound sentences.
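
The rule-based re-segmentation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the small lexicon `RELATION_WORDS`, the function name `resplit`, and the single prefix/suffix rule are all assumptions made for illustration, standing in for the relation word ontology library and the hand-built rules. The sketch only shows the general idea of finding a token in which a relation word has been fused with neighboring characters and splitting it apart.

```python
# Hypothetical stand-in for the relation word ontology library (assumption).
RELATION_WORDS = frozenset({"因为", "所以", "虽然", "但是", "如果", "那么"})


def resplit(tokens, lexicon=RELATION_WORDS):
    """Split tokens in which a relation word is fused with adjacent characters.

    Illustrative rule only: a token longer than a relation word that starts
    or ends with that relation word is cut at the boundary.
    """
    # Longest entries first, ties broken alphabetically, so the result is deterministic.
    ordered = sorted(lexicon, key=lambda w: (-len(w), w))
    result = []
    for token in tokens:
        if token in lexicon:
            result.append(token)  # already a correctly segmented relation word
            continue
        for word in ordered:
            if len(token) <= len(word):
                continue
            if token.startswith(word):
                result.extend([word, token[len(word):]])
                break
            if token.endswith(word):
                result.extend([token[:-len(word)], word])
                break
        else:
            result.append(token)  # no relation word fused into this token
    return result


# A segmenter output in which "因为" was wrongly fused with "他".
segmented = ["因为他", "病", "了", "，", "所以", "没", "来"]
fixed = resplit(segmented)
```

A rule this simple over-splits as soon as the lexicon contains short or common character sequences, which is why the approach summarized above pairs rules with context features and a trained network rather than relying on string matching alone.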
Keywords/Search Tags:Relation words, Segmentation disambiguation, BP neural network, rules