Research On Short Text Summarization Generation Method Based On Deep Learning

Posted on:2022-02-13

Degree:Master

Type:Thesis

Country:China

Candidate:L Y Zhou

GTID:2518306575969209

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the continuous development of the Internet, the volume of text information has exploded, and information overload seriously hinders people’s access to useful information. Automatic summarization can rapidly extract valuable information, but generative (abstractive) summarization methods still have problems. For example, inaccurate word segmentation and inaccurate feature extraction during generation lead to summaries that are not fluent, not accurate enough, and incomplete. To address these problems, this thesis studies a new summary generation method. The main work is as follows:

1. To address inaccurate final results caused by inaccurate word segmentation, this thesis uses a character-level embedding method to generate BERT vectors as the input of the subsequent model. The character-level embedding is applied at the input of the BERT model. Because a single Chinese character can itself form a word, single Chinese characters are used as the input, which avoids the word segmentation step and yields high-quality character vectors. The experimental results show that, while output accuracy remains high, training efficiency is greatly improved and training time is shortened compared with the general-input BERT model.

2. To address inaccurate results caused by inaccurate feature extraction during summary generation, this thesis combines the BERT character vectors described above with an improved LeakGAN model. The LeakGAN model uses a hierarchical reinforcement learning strategy. An attention mechanism is added to the discriminator of the model to extract key feature information and capture global and local associations. Combined with the BERT character vectors, this yields high-quality feature information for generating a more accurate summary.

This thesis uses the large-scale Chinese short text summarization dataset LCSTS for the simulation experiments. Evaluated with automatic ROUGE scoring and manual evaluation, and compared with an extractive text generation model and the generative baseline model Seq2Seq, the summary generation method studied in this thesis shows a certain degree of improvement in the accuracy and fluency of the generated summaries.
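The two ideas that recur in the abstract, splitting Chinese text into single characters instead of segmenting it into words, and scoring a generated summary with ROUGE, can be illustrated with a minimal sketch. It assumes a plain character-level ROUGE-N (n-gram overlap precision, recall, and F1); the abstract does not state which ROUGE variant or tooling the thesis used, and the function names `char_tokens`, `ngrams`, and `rouge_n` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def char_tokens(text):
    """Split text into single characters, skipping whitespace.

    No word segmenter is involved: every Chinese character is its own token.
    """
    return [ch for ch in text if not ch.isspace()]


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Character-level ROUGE-N as (precision, recall, F1).

    Assumption: clipped n-gram overlap between one candidate summary and
    one reference summary, with no stemming or stop-word handling.
    """
    cand = ngrams(char_tokens(candidate), n)
    ref = ngrams(char_tokens(reference), n)
    overlap = sum((cand & ref).values())  # clipped match count
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    generated = "今天天气好"
    reference = "今天天气很好"
    print(char_tokens(generated))
    print(rouge_n(generated, reference, n=1))
    print(rouge_n(generated, reference, n=2))
```

Working on characters keeps the token inventory small and sidesteps segmentation errors, which is the motivation the abstract gives for character-level input. A real pipeline would map these characters to BERT vocabulary ids rather than use them directly.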

Keywords/Search Tags:

Generative summary, BERT model, Character-level word embedding, Attention mechanism, LeakGAN model

Related items

1	Research On Automatic Document Summary Based On Generative
2	Research Of Sequence Labeling Model Based On Fine-grained Word Representations
3	Research And Implementation Of Bait Document Generation Based On LeakGAN
4	Chinese Automatic Summary Model And Its Applications
5	Research On Chinese Text Summarization Technology Based On BERT-KA-PGN Model
6	Research On Extraction Summary Generation Technology Based On Attention Mechanism
7	TMSA: A Two-stage Automatic Summary Generation Model
8	Research On Text Emotion Classification Based On BERT Embedding
9	Sentiment Analysis Of Microblog Incorporating Dynamic Character And Word Features
10	Research On Multi-feature Perception Entity Relation Extraction Model Based On Attention Mechanism