Research On Automatic Document Summary Based On Generative

Posted on: 2022-05-04
Degree: Master
Type: Thesis
Country: China
Candidate: X X Xu
Full Text: PDF
GTID: 2518306350993809
Subject: Software engineering
Abstract/Summary:
Automatic document summarization is an important means of condensing and compressing information of all kinds. It offers an efficient way to save readers a great deal of time while still giving them the most useful information, and it is widely used in abstract generation, title generation, and question answering systems. A generative (abstractive) summarization model must understand the content of a document, capture its important information, and express that information as the target summary. A computer, however, cannot grasp the theme and core meaning of a text the way a human reader can, so generative document summarization remains a challenging task in practical engineering.

The main purpose of this research is to improve the quality and readability of automatically generated summaries. Starting from an existing summarization model, this thesis proposes a new model that adds an intermediate content vector to the seq2seq model. First, a word vector model converts the document into a vectorized form that the seq2seq model can accept. This step improves the model's grasp of document semantics in the encoding stage, and the generalized document representation is then stored in the intermediate content vector. The content vector in turn helps the attention mechanism retain important information and filter out noise, so that the seq2seq model performs better. During training, the intermediate content vector and the attention mechanism together yield better document features, help the seq2seq model absorb text features, and improve summary quality to a certain extent. With the model optimized, the word vector dimension and the optimizer are then fine-tuned so that the final model generates more cost-effective summaries.

On its corresponding data set, the optimized generative summarization model produces higher-quality summaries than other models in the same field, and the experimental results show that the optimization of the original model is successful. In addition, multiple sets of comparative experiments show that the word vector dimension, the summary length setting on the specified data set, and the number of training epochs all contribute to summary quality. In short, the optimized model improves greatly on the original model, the summaries it produces are significantly better than those of the other methods compared, and the work advances research on the seq2seq model as a whole.
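The abstract does not give the model's equations, so the following is only a minimal sketch of the general attention step it refers to, not the thesis's actual method. It assumes plain dot-product scoring between one decoder state and a list of encoder states; the function names `softmax` and `attention_context` and the toy vectors are hypothetical, chosen for illustration. How the thesis builds or stores its intermediate content vector is not specified here and is not modeled.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Dot-product attention (an assumed scoring function, not the thesis's).

    Returns the attention weights over the encoder states and the
    context vector, which is the weighted sum of those states.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder states (one per source token), two dimensions.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attention_context(decoder_state, encoder_states)
```

In this toy input the first and third encoder states align with the decoder state and so receive larger weights than the second, which is the sense in which attention keeps important information and down-weights noise.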
Keywords/Search Tags: Seq2seq model, Word to vector, Automatic document summary, Attention mechanism