
Research On Generative Automatic Summarization Based On Deep Learning

Posted on: 2022-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: H Zeng
Full Text: PDF
GTID: 2518306575474124
Subject: Computer technology
Abstract/Summary:
The information explosion has made it increasingly difficult to find useful information. Helping users efficiently extract the important content of massive text data, and so relieving data overload, has become a key problem for both industry and academia. Text summarization is now one of the mainstream techniques for text information extraction, and it comes in two forms: extractive and generative (abstractive). Deep learning has driven rapid progress across many areas of natural language processing, so many researchers have also applied it to text generation. This thesis studies generative text summarization based on deep learning, focusing on summarization with pre-trained language models and on improving factual accuracy. The main contributions are as follows.

First, a sequence-to-sequence generative text summarization model based on a multilayer Transformer is proposed. The model builds on the Transformer network and its attention mechanism. The input text sequence is first converted into three embedding representations: token embedding, position embedding, and segment embedding. These are then fed into the pre-trained BERT model, which is fine-tuned with a sequence-to-sequence masking mechanism that controls which information is visible during text generation. The model exploits the strong semantic understanding and semantic representation of the pre-trained language model, which effectively brings real-world knowledge into the summary generation process while reducing the amount of training data required. In addition, the influence of different decoding algorithms (Top-p, Top-K, and beam search) on the results of this model is studied.

Second, a method for evaluating the factual accuracy of generative text summarization is proposed that combines question generation and question answering (QA). The method first uses a question generation model to produce questions from the generated summary. It then uses QA to answer each question twice, once from the summary and once from the original text, and finally compares the similarity of the two answers to obtain the factual accuracy of the summary.

The experimental results show that the generative text summarization model based on the pre-trained language model proposed in this thesis outperforms the benchmark model. The proposed factual accuracy evaluation method also evaluates summaries better than the traditional automatic evaluation metric ROUGE.
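The question-generation and question-answering evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not say how answer similarity is computed, so token-overlap F1 is an assumption here, and `generate_questions` and `answer` are hypothetical callables standing in for whatever question generation and QA models are actually used.

```python
from collections import Counter
from typing import Callable, List


def token_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two answer strings (assumed similarity measure)."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        # Two empty answers agree; one empty answer disagrees.
        return float(ta == tb)
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(ta)
    recall = overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def factual_accuracy(
    summary: str,
    source: str,
    generate_questions: Callable[[str], List[str]],  # hypothetical QG model
    answer: Callable[[str, str], str],               # hypothetical QA model: (question, context) -> answer
) -> float:
    """Average agreement between summary-based and source-based answers."""
    questions = generate_questions(summary)
    if not questions:
        return 0.0
    scores = [
        token_f1(answer(q, summary), answer(q, source))
        for q in questions
    ]
    return sum(scores) / len(scores)
```

A score near 1 means the summary and the original text yield the same answers to questions raised by the summary; a low score suggests the summary states something the original text does not support.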
Keywords/Search Tags: Deep neural network, Pre-trained language model, Generative text summarization, Beam search, Factual accuracy