Research On Chinese Text Summarization Algorithm Based On Deep Learning

Posted on:2021-01-13

Degree:Master

Type:Thesis

Country:China

Candidate:F X Ma

Full Text:PDF

GTID:2428330647461934

Subject:Computer Science and Technology

Abstract/Summary:

With the exponential growth of textual information available on the Internet, information overload has become a serious problem. Reducing the amount of information a user must process, a kind of "dimensionality reduction", is therefore necessary, and automatic text summarization is an important way to achieve it. With the development of deep learning, more and more researchers are using deep learning techniques to generate text summaries automatically. This thesis studies summary generation methods based on deep learning. The main work is as follows.

First, to address the unknown words and incomplete content that tend to arise during summary generation, a Chinese text summary generation algorithm based on keyword information and adversarial learning is proposed. The algorithm has two stages: keyword extraction and summary generation. Keywords are first extracted with an attention-based Seq2Seq model. Adversarial learning is then used to dynamically shorten the semantic distance between the source text and the summary text, and on this basis the extracted keyword information is added to the attention mechanism, so that the model pays more attention to the key information in the source text and generates a more comprehensive summary. Experimental results on the LCSTS dataset show that the proposed algorithm effectively improves summary accuracy and reduces the number of unknown words. Compared with the Seq2Seq method, the ROUGE-1, ROUGE-2 and ROUGE-L scores improve by 6.1%, 4.8% and 6.2%, respectively.

Second, to address the low accuracy caused by long-term dependencies when generative summarization algorithms process long texts, a new long-text summarization algorithm is proposed, consisting of topic sentence extraction and summary generation. In the topic sentence extraction stage, doc2vec is added to improve the text similarity calculation in TextRank and thus the accuracy of key sentence extraction. In the summary generation stage, the key sentences obtained in the previous stage are used as the input, and a gated unit comprising a CNN and a self-attention mechanism is added between the encoder and decoder of Seq2Seq. The gated unit extracts n-gram information, controls the information flow of the model, and reduces word repetition in the generated summaries. Experimental results on a crawled Sina Finance News dataset show that this method is more accurate on long texts than an extractive or a generative method used alone.

This work provides a new research direction for automatic text summarization. The proposed methods significantly improve ROUGE scores and are practical for alleviating information overload.
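The topic sentence extraction stage described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes each sentence has already been mapped to a dense vector (a stand-in for doc2vec embeddings), uses cosine similarity between those vectors as the TextRank edge weight, and ranks sentences with a standard damped power iteration. The function names `cosine`, `textrank` and `top_sentences`, the damping factor of 0.85, and the iteration count are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def textrank(vectors, damping=0.85, iterations=50):
    """Score sentences by TextRank over a graph weighted by embedding similarity.

    `vectors` holds one embedding per sentence; here they are assumed to
    come from a doc2vec-style model, which this sketch does not train.
    """
    n = len(vectors)
    if n == 0:
        return []
    # Edge weights: non-negative cosine similarity, no self-loops.
    weights = [
        [max(cosine(vectors[i], vectors[j]), 0.0) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the edge weight.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0.0
            )
            updated.append((1.0 - damping) / n + damping * rank)
        scores = updated
    return scores


def top_sentences(sentences, vectors, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    scores = textrank(vectors)
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(best)]
```

Calling `top_sentences(sentences, vectors, k)` with one vector per sentence returns the `k` sentences most central to the document. In the two-stage pipeline the abstract describes, sentences selected this way would then serve as the input to the Seq2Seq summary generator.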

Keywords/Search Tags:

Text summarization, adversarial learning, Seq2Seq, TextRank, attention

Related items

1	Research On Text Summarization Algorithm Based On Deep Learning
2	Seq2seq Attention:Super Long Chinese Text Summarization Model
3	Research And Application Of Automatic Text Summarization Technology Based On Deep Learning
4	Research On Text Summarization Generation Technology Based On Neural Network
5	Improved Seq2Seq Text Summarization Generation Methods
6	Research On Automatic Generation Method Of Chinese Text Summarization
7	Improved Attentional Seq2seq With Policy Gradient For Text Summarization
8	Research And Implementation Of Automatic Text Summarization Based On Seq2Seq Model
9	Research On Automatic Generation Of Chinese Text Abstract Based On CNN
10	The Research On Abstractive Text Summarization Method Based On Reinforcement Learning And Sentence Level Evaluation