Research On Writing Imitation Based On Semantic Unit Substitution

Posted on: 2018-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: D D Ning
GTID: 2348330536981914
Subject: Computer Science and Technology
Abstract/Summary:
Natural language generation, an important part of natural language processing, is studied and applied by major universities and research institutions. With the rise of deep learning and big data, natural language generation has seen breakthroughs in areas such as dialogue systems and automatic news generation. The college entrance examination robot project, part of the National 863 Program, is a specific application of natural language processing. Chinese writing is its most important component, and automatically generating a composition from a given title is a great challenge for a computer. At present, generative writing is divided into three steps: purposive analysis, sentence extraction, and sentence ordering. However, this method relies heavily on the corpus: when the composition corpus contains little material on the topic, the quality of the generated composition drops. To solve this problem, we propose an imitation writing method. Given an existing composition on a related topic, imitation writing imitates its sentence patterns to generate a new composition. Based on the role of each sentence in the composition, we propose word-level and sentence-level writing imitation methods and imitate sentences with different roles in different ways.

Word-level imitation writing mainly targets the idea sentences and evidence sentences in a composition. First, the method derives a sentence template according to certain rules. Second, it uses similarity, synonyms, bigrams, and sentence context features to obtain a candidate word set. Finally, it uses a language model to substitute the words with the maximum likelihood. The substitution results show that the candidate word sets obtained from contextual features are the best.

Sentence-level imitation writing mainly targets the example sentences in a composition. Because persons and deeds correspond one-to-one in example sentences, word-level substitution may produce mismatches between persons and events. We therefore propose a sentence-level imitation writing method, which can also be understood as a sentence-level paraphrase problem, and approach it with sentence-level paraphrase generation based on the seq2seq model. This thesis first tries the basic seq2seq model for sentence paraphrasing and then adds an attention mechanism; a comparison of the generated sentences shows that the model with attention performs better. In addition, we introduce a copy mechanism and a coverage mechanism to improve the model. The copy mechanism handles the special case in which names of people and places appear in the original sentence, where we want the model to copy these words unchanged. Experimental results show that the copy mechanism improves this case and generates better sentences. To address the repetition problem common in seq2seq models, we add a coverage mechanism on top of the copy mechanism, which effectively reduces repetition in the generated sentences.
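
The final step of the word-level method, choosing the candidate word that a language model scores as most likely, can be illustrated with a minimal sketch. The abstract does not say which language model or smoothing the thesis uses, so the add-one smoothed bigram model below is an assumption, and the toy English corpus, the `[SLOT]` marker, and the function names `train_bigram`, `log_prob`, and `fill_slot` are all hypothetical.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-likelihood (smoothing choice is an assumption)."""
    vocab = len(unigrams)
    tokens = ["<s>"] + sentence + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
        for a, b in zip(tokens, tokens[1:])
    )

def fill_slot(template, slot, candidates, unigrams, bigrams):
    """Return the candidate that maximizes the likelihood of the filled template."""
    def score(word):
        filled = [word if tok == slot else tok for tok in template]
        return log_prob(filled, unigrams, bigrams)
    return max(candidates, key=score)

# Toy corpus standing in for the composition corpus (hypothetical data).
corpus = [
    ["hard", "work", "brings", "success"],
    ["hard", "work", "brings", "reward"],
    ["patience", "brings", "success"],
]
unigrams, bigrams = train_bigram(corpus)

# A sentence template with one replaceable position and its candidate word set.
template = ["hard", "work", "brings", "[SLOT]"]
candidates = ["success", "patience", "reward"]
best = fill_slot(template, "[SLOT]", candidates, unigrams, bigrams)
print(best)
```

In this sketch the candidate word set is given directly; in the method described above it would come from similarity, synonyms, bigrams, or sentence context features.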
Keywords/Search Tags: writing imitation, seq2seq model, attention mechanism, copy mechanism, coverage mechanism