Research On Key Issues Of Automatic Text Summarization Technology

Posted on: 2021-05-23

Degree: Master

Type: Thesis

Country: China

Candidate: J Ma

Full Text: PDF

GTID: 2428330623467775

Subject: Computer Science and Technology

Abstract/Summary:

With the rapid development of Internet technology, the network has become the main source of information. Rich and diverse information resources bring great convenience, but the sheer volume of text also makes it hard to find what matters. Extracting key information quickly from the large amount of text on the Internet has become a challenge. Compressing and extracting text with automatic text summarization has become an effective way to obtain high-quality information in the era of information explosion. This thesis studies automatic text summarization, focusing on deep-learning-based abstractive summarization and dialogue text summarization. The main contributions are as follows:

(1) Encoder-Decoder based abstractive summarization has two problems: the evaluation metric differs from the training objective function, and the model suffers from exposure bias. A generative adversarial network (GAN) can address both, but introduces new difficulties: optimizing over discrete data and generating text under conditions. To solve these problems, this thesis combines the advantages of the two methods by introducing adversarial training into the traditional Encoder-Decoder framework. After a well-performing Encoder-Decoder is pre-trained, the encoding of the complete sequence is learned and optimized with a generative adversarial network. The evaluation metric guides the optimization of the model, and the problems of discrete data processing and conditional generation are avoided. Experiments show that the proposed method improves the performance of the abstractive summarization model.

(2) Because relevant conferences have not provided a large-scale standard data set, dialogue text summarization is difficult to model end-to-end with deep learning. Compared with essay text, dialogue text is longer overall, its sentences are shorter, and its topics are discretely distributed, so traditional essay summarization methods do not achieve good results. For the out-of-vocabulary (OOV) problem, named entity recognition is used to reduce OOV words in dialogue text. For semantic vector representation, this thesis proposes a temporal self-supervised encoder, which constructs dialogue sentence vectors that carry temporal information. To handle the discrete distribution of topics, the dialogue text is divided into topics through a self-supervised segmentation model and unsupervised clustering, forming complete dialogue subsets. Then, according to the characteristics of dialogue text, two kinds of methods are proposed: abstractive summarization and template-based summarization. The approach relies mainly on unsupervised and self-supervised models, which overcomes the shortage of labeled samples. Experiments on dialogue data sets verify its effectiveness.

(3) Based on the above research, a prototype web-based automatic text summarization system is designed and implemented. With simple operations, users can try the automatic text summarization models of this thesis on a web page.
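The abstract does not say how named entity recognition is applied to the OOV problem. One common reading is that recognized entities are replaced with type placeholders before vocabulary lookup, so that rare names no longer count as unknown words. The sketch below illustrates that idea only and is not the thesis's implementation. The gazetteer TOY_ENTITIES, the placeholder format, and the functions tag_entities and oov_rate are all hypothetical, and a simple dictionary lookup stands in for a trained NER model.

```python
# Hypothetical toy gazetteer standing in for a trained NER model.
TOY_ENTITIES = {
    "alice": "PERSON",
    "bob": "PERSON",
    "chengdu": "LOCATION",
}


def tag_entities(tokens, gazetteer=TOY_ENTITIES):
    """Replace tokens found in the gazetteer with a type placeholder."""
    tagged = []
    for token in tokens:
        entity_type = gazetteer.get(token.lower())
        tagged.append(f"<{entity_type}>" if entity_type else token)
    return tagged


def oov_rate(tokens, vocab):
    """Return the fraction of tokens missing from the (lowercased) vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token.lower() not in vocab)
    return missing / len(tokens)


# Assumed small vocabulary that contains the placeholders but no proper names.
vocab = {"will", "meet", "in", "tomorrow", "<person>", "<location>"}
utterance = "Alice will meet Bob in Chengdu tomorrow".split()

before = oov_rate(utterance, vocab)                # names count as OOV
after = oov_rate(tag_entities(utterance), vocab)   # names mapped to placeholders
print(f"OOV rate before: {before:.2f}, after: {after:.2f}")
```

In this toy setting the three proper names are the only unknown tokens, so replacing them removes all OOV words. A real system would also need to map the placeholders back to the original entities in the generated summary.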

Keywords/Search Tags:

Automatic Text Summarization, Encoder-Decoder, Generative Adversarial Network, Dialogue Text, Self-Supervised Learning

Related items

1	Research On Key Issues Of Text Summarization Based On Automatic Machine Learning
2	Automatic Text Summarization And Its Application In Aviation Safety Reports
3	Research Of Short Text Summary Generation Based On Text Structure Information
4	The Technology Of Automatic Text Summarization Based On Deep Learning
5	Research On Text Automatic Summarization Combined With Transfer Learning
6	Research On Text Summarization Technology Based On Improved Transformer Model
7	Research And Design Of News Text Summarization Based On Generative Adversarial Net
8	Weakly Supervised Learning Of A Generative Adversarial Nets For Semantic Segmentation
9	Research On Chinese Text Summarization Algorithm Based On Deep Learning
10	Design And Implementation Of Text Data Enhancement System Based On Generative Adversarial Network