Research On Generative Automatic Summarization Based On Deep Learning

Posted on: 2022-06-08

Degree: Master

Type: Thesis

Country: China

Candidate: H Zeng

Full Text: PDF

GTID: 2518306575474124

Subject: Computer technology

Abstract/Summary:

The information explosion has made it increasingly difficult to obtain useful information. How to relieve data overload so that users can efficiently extract the important information contained in massive text data has therefore become a key problem for both industry and academia. Text summarization is now one of the mainstream technologies for extracting information from text, and it comes in two forms: extractive and generative. Deep learning has driven rapid progress across many areas of natural language processing, and many researchers have applied it to text generation as well. This thesis studies generative text summarization based on deep learning, focusing on summarization with pre-trained language models and on improving the factual accuracy of the generated summaries. The main contributions are as follows.

First, a sequence-to-sequence generative text summarization model based on a multilayer Transformer is proposed. The model builds on the Transformer network and its attention mechanism. The input text sequence is first converted into three embedding representations (token embedding, position embedding, and segment embedding) and then fed into the pre-trained BERT model, which is fine-tuned with a sequence-to-sequence masking mechanism that controls which information is visible during text generation. The model draws on the strong semantic understanding and representation ability of the pre-trained language model, which effectively incorporates real-world knowledge into summary generation while reducing the amount of training data required. In addition, the influence of different decoding algorithms (Top-p, Top-K, and beam search) on the model's results is studied.

Second, a method for evaluating the factual accuracy of generative text summaries is proposed that combines question generation and question answering (QA). The method first uses a question generation model to generate questions from the generated summary, then uses QA to answer each question against both the summary and the original text, and finally compares the similarity of the two answers to obtain the factual accuracy of the summary.

The experimental results show that the proposed generative text summarization model based on a pre-trained language model outperforms the benchmark model, and that the proposed factual accuracy evaluation method evaluates summaries better than the traditional automatic evaluation metric ROUGE.
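
The abstract does not spell out the layout of the sequence-to-sequence mask. As a minimal sketch, assuming the commonly used arrangement in which source tokens see only the source while each summary token sees the full source plus the summary tokens generated so far, the visibility matrix could be built as follows. The function name and the 0/1 encoding are illustrative choices, not taken from the thesis.

```python
def seq2seq_attention_mask(src_len, tgt_len):
    """Build a square visibility matrix over source + target positions.

    mask[i][j] == 1 means position i may attend to position j.
    Assumed layout (not confirmed by the thesis): source positions come
    first, followed by target (summary) positions.
    """
    total = src_len + tgt_len
    mask = [[0] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < src_len:
                # Every position, source or target, can see the whole source.
                mask[i][j] = 1
            elif i >= src_len and j <= i:
                # A target position sees itself and earlier target positions only.
                mask[i][j] = 1
    return mask


if __name__ == "__main__":
    for row in seq2seq_attention_mask(src_len=2, tgt_len=3):
        print(row)
```

Under this assumption the source block is fully bidirectional and the target block is lower triangular, so a single encoder-style network can be fine-tuned for generation without a separate decoder.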
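
Two of the decoding strategies named in the abstract, Top-K and Top-p sampling, differ only in how they truncate the next-token distribution before sampling. The sketch below illustrates that truncation on a toy distribution; the function names and the dictionary representation are assumptions made for illustration, and beam search is not shown.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


if __name__ == "__main__":
    # Toy next-token distribution (hypothetical values).
    next_token_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    print(top_k_filter(next_token_probs, k=2))
    print(top_p_filter(next_token_probs, p=0.6))
```

Top-K keeps a fixed number of candidates regardless of how peaked the distribution is, whereas Top-p adapts the candidate set to the shape of the distribution.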
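
The question-generation and question-answering evaluation loop can be sketched as below. The abstract does not say which models or which similarity measure are used, so `generate_questions` and `answer_question` are hypothetical callables standing in for those models, and token-level F1 is an assumed stand-in for the answer similarity.

```python
from collections import Counter


def token_f1(answer_a, answer_b):
    """Token-overlap F1 between two answer strings (assumed similarity measure)."""
    tokens_a = answer_a.lower().split()
    tokens_b = answer_b.lower().split()
    if not tokens_a or not tokens_b:
        # Two empty answers count as a match; one empty answer does not.
        return float(tokens_a == tokens_b)
    overlap = sum((Counter(tokens_a) & Counter(tokens_b)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokens_a)
    recall = overlap / len(tokens_b)
    return 2 * precision * recall / (precision + recall)


def factual_accuracy(summary, source, generate_questions, answer_question):
    """Average answer agreement between the summary and the source text.

    generate_questions(text) -> list of question strings   (hypothetical QG model)
    answer_question(question, context) -> answer string    (hypothetical QA model)
    """
    questions = generate_questions(summary)
    if not questions:
        return 0.0
    scores = []
    for question in questions:
        answer_from_summary = answer_question(question, summary)
        answer_from_source = answer_question(question, source)
        scores.append(token_f1(answer_from_summary, answer_from_source))
    return sum(scores) / len(scores)
```

A summary whose answers agree with those drawn from the source scores close to 1, while a summary that states facts the source contradicts scores lower, which is the signal that n-gram overlap metrics such as ROUGE do not measure directly.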

Keywords/Search Tags:

Deep neural network, Pre-trained language model, Generative text summarization, Beam search, Factual accuracy

Related items

1	Research And Application Of Related Techniques For Text Summarization Based On Deep Learning
2	Research On Abstractive Text Summarization Based On Pre-trained Language Model
3	Research And Application Of Factual Correctness Technology For Automatic Text Summarization Based On Deep Learning
4	Research And Implementation Of Text Summarization System Based On Pre-Trained Language Model
5	Research On Factual Problems In Text Summarization
6	Research On Text Summarization Technology Based On Deep Neural Network
7	A Study On Deep Neural Network-based Text Generation Method
8	Research On Key Technologies Of Generative Model Based On Abstracts Of Chinese Scientific Papers
9	Research On News Text Summarization Algorithm Based On Pre-trained Language Model
10	Research And Application Of Automatic Text Summarization Technology Based On Deep Learning