
Research on Extraction-Based Essay Generation

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: H T Leng
Full Text: PDF
GTID: 2428330566498097
Subject: Computer Science and Technology
Abstract/Summary:
In 2015, China launched an in-depth research project on an answering robot for the College Entrance Examination. In the examination, the Chinese composition accounts for a large proportion of the score, so an excellent composition is essential to a satisfactory result. For this reason, we carry out an in-depth study of extraction-based essay generation.

Extraction-based essay generation must first extract the required sentences from a large corpus. For a college entrance examination composition, the required sentences all center on the given topic, yet the composition also calls for rich semantic layering. We therefore propose a method that combines semantic extraction with relevance-based extension. We first use semantic information to obtain sentences closely related to the topic. If only these sentences are used, however, the semantic scope of the composition is too narrow and the sentences read as semantic repetition. We therefore also extend the extracted sentences with related ones. Specifically, we propose one sentence extension method based on association rules and another based on LDA. Experiments show that both methods can effectively extract sentences related to the original sentences. In particular, the LDA method extracts a richer set of related sentences, which addresses the semantic narrowness caused by extracting on semantic information alone. Moreover, the sentences extended by the LDA method greatly enrich the sentence types among the candidates: with extraction alone, the candidate sentences are almost all of the statement and summary types, whereas the LDA method also yields other types, such as argumentation and allusion.

Once the candidate sentences are obtained, the paragraph text must be generated. In this module, we use sentence ranking methods. Our baseline is Learning to Rank based on statistical machine learning, and we also use a pairwise method and a Ptr-Net method based on deep learning. The two deep learning methods avoid complex feature engineering. The experimental results show that both achieve good results, with Ptr-Net leading the baseline by 5 percentage points in accuracy, which we attribute to its more advanced principle.

After obtaining the text of each paragraph, we need a discourse-level layout, so we propose paragraph ranking based on the discourse text. We use a hierarchical Ptr-Net paragraph ranking model and a skip-thought-based Ptr-Net paragraph ranking model as baselines, and propose a paragraph ranking model based on key sentence extraction, which is closer to how humans actually write. This method yields a qualitative improvement over the two baselines. During composition generation, the tool provided by Xunfei cannot be used directly. We therefore migrate the SummaRuNNer model from the summarization task to the key sentence extraction task, and use a key sentence extraction method based on Hierarchical Attention as the baseline. The results show that SummaRuNNer outperforms the baseline: it improves the accuracy rate by nearly 3 percentage points and the quasi-accuracy rate by 4 percentage points.
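
As an illustration of the pairwise sentence ranking idea mentioned in the abstract, the following is a minimal sketch and not the thesis's actual model. It assumes some scorer that estimates how strongly one sentence should precede another, then orders sentences by their total pairwise score. The names `order_by_pairwise_wins`, `toy_precedes_score`, and `CUE_RANK` are hypothetical, and the toy scorer is a hand-written rule standing in for a learned deep model.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def order_by_pairwise_wins(
    sentences: Sequence[str],
    precedes_score: Callable[[str, str], float],
) -> List[str]:
    """Order sentences by their summed pairwise 'comes before' scores.

    precedes_score(a, b) is assumed to return a value in [0, 1]:
    higher means a is more likely to come before b.
    """
    totals = {i: 0.0 for i in range(len(sentences))}
    for i, j in permutations(range(len(sentences)), 2):
        totals[i] += precedes_score(sentences[i], sentences[j])
    # Highest total first; ties keep the original input order.
    ranked = sorted(totals, key=lambda i: (-totals[i], i))
    return [sentences[i] for i in ranked]


# Toy stand-in for a learned pairwise model (assumption for illustration):
# it only looks at a discourse cue word at the start of each sentence.
CUE_RANK = {"first": 0, "then": 1, "finally": 2}


def toy_precedes_score(a: str, b: str) -> float:
    def rank(sentence: str) -> int:
        first_word = sentence.split()[0].lower().strip(",")
        return CUE_RANK.get(first_word, 1)

    rank_a, rank_b = rank(a), rank(b)
    if rank_a < rank_b:
        return 1.0
    if rank_a > rank_b:
        return 0.0
    return 0.5


candidates = [
    "Finally, persistence turns effort into results.",
    "First, every goal begins with a small step.",
    "Then, setbacks test our resolve.",
]
ordered = order_by_pairwise_wins(candidates, toy_precedes_score)
for sentence in ordered:
    print(sentence)
```

Summing pairwise scores is only one simple way to turn pairwise preferences into a full order. A pointer network such as Ptr-Net instead predicts the sequence directly, which is a different design from this sketch.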
Keywords/Search Tags:sentence extraction, sentence ordering, paragraph ordering, text generation