
Paraphrasing Method Research Of Chinese Complex Sentences Based On Templates

Posted on: 2017-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Lin
GTID: 2348330485981726
Subject: Computer application technology
Abstract/Summary:
Chinese is rich in expressive forms: a single sentence can often be expressed in many different ways without changing its meaning. Research on sentence paraphrasing, which generates and identifies different expressions of the same meaning, makes it possible to build a paraphrase corpus. The results and techniques can be applied to many areas of natural language processing, which gives this research its significance.

Building on work on paraphrasing simple Chinese sentences, this thesis proposes a template-based method for paraphrasing Chinese complex sentences. Paraphrasing templates are extracted by analyzing the classification and grammatical structure of complex sentences. A complex-sentence corpus is built with associated words at its core and is then word-segmented and part-of-speech tagged. A large number of similarity calculation experiments on this corpus determine the similarity threshold between sentences to be paraphrased and paraphrasing templates. Finally, the original sentence is paraphrased.

Extracting and matching paraphrasing templates is the key to paraphrasing a sentence. In this thesis, templates are expressed in terms of parts of speech. During extraction, templates are divided into four types according to the number of variables, and an original sentence corresponds to one or several paraphrasing templates. Variable matching is an important part of matching a sentence to a paraphrasing template. During paraphrasing, the sentence to be paraphrased is matched against a template by keyword and part of speech. Components of the sentence that do not match the template are bundled together, and each bundle is treated as a variable.

For similarity calculation, an algorithm is constructed according to the characteristics of complex sentences. In the algorithm, the length of the key words is increased. Experiments show that the algorithm effectively addresses the effect of complex-sentence length on experimental accuracy. A further difficulty is that most original sentences contain several key words with the same part of speech, so reading each part of speech and restoring the sentence is very prone to confusion. Position labeling and a key-value method are proposed, and they resolve this problem well.

The threshold for the paraphrasing experiment is obtained from the sentence similarity calculations, and the experiment is run using the collected corpus sentences and the complex-sentence paraphrasing templates. Under manual evaluation, the paraphrasing accuracy is 48.85% and the paraphrasing coverage is 63.06%. The experimental results show that the method is feasible and effective.
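
The matching step described above (anchor on associated words, bundle whatever lies between them into variables, then refill a paraphrase template) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the template pair, the variable names X and Y, and the functions `match` and `paraphrase` are hypothetical, the input is assumed to be already word-segmented, and part-of-speech checks and the similarity threshold are left out for brevity.

```python
from typing import Dict, List, Optional

# Hypothetical template pair for a causal complex sentence.
# Uppercase ASCII items are variables; everything else is an associated word.
SOURCE = ["因为", "X", "，", "所以", "Y"]
TARGET = ["之所以", "Y", "，", "是因为", "X"]


def is_var(item: str) -> bool:
    """Variables are written as uppercase ASCII names in this sketch."""
    return item.isascii() and item.isupper()


def match(template: List[str], tokens: List[str]) -> Optional[Dict[str, List[str]]]:
    """Match tokens against a template; return variable bindings or None.

    Assumes two variables never sit next to each other in a template.
    """
    bindings: Dict[str, List[str]] = {}
    pos = 0
    for i, item in enumerate(template):
        if not is_var(item):
            # An associated word must appear exactly at the current position.
            if pos >= len(tokens) or tokens[pos] != item:
                return None
            pos += 1
            continue
        # Bundle all tokens up to the next associated word into this variable.
        if i + 1 < len(template):
            try:
                end = tokens.index(template[i + 1], pos)
            except ValueError:
                return None
        else:
            end = len(tokens)
        if end == pos:
            return None  # an empty bundle is treated as a failed match
        bindings[item] = tokens[pos:end]
        pos = end
    return bindings if pos == len(tokens) else None


def paraphrase(tokens: List[str]) -> Optional[List[str]]:
    """Rewrite tokens with TARGET if they match SOURCE, else return None."""
    bindings = match(SOURCE, tokens)
    if bindings is None:
        return None
    out: List[str] = []
    for item in TARGET:
        out.extend(bindings[item] if is_var(item) else [item])
    return out


sentence = ["因为", "下雨", "了", "，", "所以", "比赛", "取消", "了"]
result = paraphrase(sentence)
print("".join(result) if result else "no match")
```

Storing the bundles in a dictionary keyed by variable name keeps each bundle tied to its slot, which is one simple way to avoid confusing components when the sentence is reassembled.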
Keywords/Search Tags:Complex sentence, Associated word, Paraphrasing template