
Research on News Text Summarization Based on Deep Learning

Posted on: 2022-11-16 | Degree: Master | Type: Thesis
Country: China | Candidate: S R Yang | Full Text: PDF
GTID: 2518306749471984 | Subject: Automation Technology
Abstract/Summary:
With the ever-growing volume of news on the Internet, it is increasingly difficult to locate the information one needs within a large body of text quickly and efficiently, so automatic text summarization has become indispensable. Extractive summarization achieves good results, but the extracted summaries are often incoherent and hard to read; abstractive summarization can produce relatively fluent sentences, but it easily drifts off topic and generates incorrect or repetitive content. This thesis combines the two summarization modes and proposes a hybrid generation method. In addition, to address inaccurate summary generation, rigid wording, and inadequate summary evaluation in current abstractive systems, automatic text summarization is improved in three ways:

(1) Sentence-vector representations that ignore context fail to capture semantics: traditional extractive methods rely on untrained feature- or graph-based techniques, and recurrent-neural-network methods yield only unidirectional semantic vectors. This thesis uses an improved BERT encoding to obtain contextual semantics, producing better extractive summaries; it also removes redundant information and retains key content for the generative stage. Compared with traditional extractive summarization, the ROUGE score improves by an average of 3.09 points.

(2) To address the lack of novelty in generated summaries, slow generation, out-of-vocabulary words, and repetition, mainstream generative approaches typically use a hybrid pointer-generator network. This thesis proposes a generation method that combines a Transformer with an improved pointer-generator network: the self-attention mechanism encodes context vectors better and faster, strengthens the model's ability to select new words, and increases the novelty of the summaries (a minimal sketch of the copy mechanism is given below).

(3) Because existing evaluation methods cannot judge summary quality well, a new evaluation method, BERT-EVAL, is proposed. A BERT model fine-tuned on semantic similarity computes the similarity between the two sentence vectors, comparing the reference summary and the generated summary from a semantic perspective (a sketch of this scoring step is also given below).

Experiments demonstrate the effectiveness of the proposed model: on the NLPCC2017 Chinese dataset, the ROUGE scores rise to 39.78%, 25.69%, and 35.15%, respectively. For the newly proposed evaluation method, data sampled from an English dataset are evaluated manually, and the manual scores are compared with both the ROUGE score and the proposed BERT-EVAL score. The experiments show that BERT-EVAL agrees more closely with human judgment, demonstrating the effectiveness of the evaluation method.
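The copy mechanism referenced in contribution (2) mixes a vocabulary distribution with an attention-based copy distribution. The following is a minimal PyTorch sketch of that mixing step, assuming the Transformer encoder/decoder has already produced the decoder state, the attention context, vocabulary logits, and attention weights over the source tokens; all names and shapes are illustrative, not the thesis's actual implementation.

```python
# Minimal sketch of the pointer-generator mixing step (PyTorch).
# Assumes source token ids fall inside the fixed vocabulary; a full
# implementation would extend the distribution to handle OOV copies.
import torch
import torch.nn.functional as F

def pointer_generator_step(vocab_logits, attention, src_ids, decoder_state, context, w_gen):
    """Combine the generation and copy distributions for one decoding step.

    vocab_logits:  (batch, vocab_size)  decoder output over the fixed vocabulary
    attention:     (batch, src_len)     attention weights over source positions
    src_ids:       (batch, src_len)     source token ids
    decoder_state: (batch, hidden)      current decoder hidden state
    context:       (batch, hidden)      attention context vector
    w_gen:         nn.Linear(2 * hidden, 1)  gate that predicts p_gen
    """
    # p_gen decides how much probability mass goes to generation vs. copying.
    p_gen = torch.sigmoid(w_gen(torch.cat([decoder_state, context], dim=-1)))  # (batch, 1)
    p_vocab = F.softmax(vocab_logits, dim=-1)                                   # (batch, vocab)
    # Scatter the copy probabilities onto the vocabulary positions of the
    # source tokens, so the model can reproduce words seen in the input.
    copy_dist = torch.zeros_like(p_vocab).scatter_add_(1, src_ids, attention)
    return p_gen * p_vocab + (1.0 - p_gen) * copy_dist
```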
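For contribution (3), a BERT-EVAL-style score can be sketched as follows: embed the reference and the generated summary with a BERT model fine-tuned for semantic similarity, then compare the two sentence vectors with cosine similarity. The checkpoint name below is a placeholder assumption, not the model used in the thesis.

```python
# Minimal sketch of a BERT-EVAL-style semantic similarity score.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-chinese"  # assumption: substitute a similarity-tuned BERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

def sentence_vector(text: str) -> torch.Tensor:
    """Mean-pool the last hidden states into a single sentence embedding."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)   # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

def bert_eval_score(reference: str, generated: str) -> float:
    """Cosine similarity between reference and generated summary embeddings."""
    ref_vec, gen_vec = sentence_vector(reference), sentence_vector(generated)
    return torch.cosine_similarity(ref_vec, gen_vec, dim=-1).item()
```

Unlike n-gram overlap metrics such as ROUGE, this score compares the two summaries at the level of sentence semantics, which is the motivation behind BERT-EVAL.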
Keywords/Search Tags: text summarization, BERT, Transformer, pointer-generator network, text summary evaluation