Research On Automatic Summarization Of News Text Based On Neural Network

Posted on: 2021-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: S X Wu
GTID: 2428330626460357
Subject: Computer Science and Technology
Abstract/Summary:
News text is among the most common carriers of media information in daily life. With the rapid development of the Internet, the volume of news text has grown explosively, and this mass of text poses a serious challenge to readers. How to use computers to automatically generate summaries of news text, so that users can browse news quickly and read more efficiently, has become an important research topic. At present, automatic summarization of news text follows two main directions: extractive and abstractive. The former extracts a certain number of sentences from the text to form the summary and is mainly applied to long texts; the latter generates a summary after reading and understanding the text and is usually applied to short texts. Each has its own advantages for different types of input. This thesis addresses both directions.

For the extractive method, a multi-feature fusion model is proposed to address insufficient feature mining in automatic text summarization. Specifically, four features are selected to build a summarization system based on the multi-feature fusion model: sentence vocabulary, relative position, sentence length, and inter-sentence similarity. Among them, the lexical feature is based on the syntax tree and makes full use of grammatical information, overcoming the limitations of traditional keyword extraction methods; the relative position feature assigns values to sentences using high-level information about their location; the length feature filters out long sentences; and the similarity feature builds sentence vectors with the smooth inverse frequency (SIF) sentence embedding method to compute the similarity between sentences effectively.

For the abstractive method, current models make insufficient use of the overall semantic information of the summary during decoding. To address this, a neural automatic summarization method based on semantic alignment is proposed. The method builds on a Sequence-to-Sequence model with attention, the pointer mechanism, and the coverage mechanism, and adds a semantic alignment network between the encoder and the decoder to align the semantic information of the text with that of the summary. The resulting overall semantic information of the summary is then concatenated with the context vector that the decoder uses for vocabulary prediction. As a result, when predicting the current word, the decoder uses not only the partial semantics of the word sequence predicted so far but also the overall semantics of the predicted summary, which improves the quality of the generated summary.

Experiments were conducted on the NLPCC2017 news text automatic summarization evaluation corpus and the large-scale LCSTS news text corpus, and both proposed methods improved the quality of the generated summaries. However, the extractive summaries still need improvement in semantic coherence, and the abstractive summaries still have room to improve in handling OOV (out-of-vocabulary) words and in accuracy.
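The four-feature extractive scoring described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, and it rests on several assumptions. The keyword set is passed in by hand rather than derived from a syntax tree. A bag-of-words cosine similarity stands in for the sentence-embedding similarity. The features are fused by a weighted sum, which the abstract does not specify. The function names, the weights, and the `max_len` threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def score_sentences(sentences, keywords, weights=(0.4, 0.2, 0.1, 0.3), max_len=30):
    """Score each sentence by a weighted sum of four simplified features.

    The weights and max_len are illustrative values, not taken from the thesis.
    """
    bags = [Counter(re.findall(r"\w+", s.lower())) for s in sentences]
    n = len(sentences)
    w_lex, w_pos, w_len, w_sim = weights
    scores = []
    for i, bag in enumerate(bags):
        length = sum(bag.values())
        # Lexical feature: share of tokens that are (hand-supplied) keywords.
        lexical = sum(bag[k] for k in keywords) / length if length else 0.0
        # Relative position feature: earlier sentences score higher.
        position = 1.0 - i / n
        # Length feature: zero for empty or overly long sentences.
        length_ok = 1.0 if 0 < length <= max_len else 0.0
        # Similarity feature: mean cosine similarity to all other sentences.
        others = [cosine(bag, b) for j, b in enumerate(bags) if j != i]
        similarity = sum(others) / len(others) if others else 0.0
        scores.append(
            w_lex * lexical + w_pos * position + w_len * length_ok + w_sim * similarity
        )
    return scores


def summarize(sentences, keywords, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = score_sentences(sentences, keywords)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "The storm hit the coast on Monday.",
        "Officials said the storm caused flooding.",
        "A local bakery sold bread.",
    ]
    print(summarize(doc, {"storm", "flooding"}, k=2))
```

Returning the selected sentences in document order rather than score order is a design choice that keeps the extract readable. It does not by itself address the semantic coherence limitation noted in the abstract.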
Keywords/Search Tags:News Text, Multi-feature Fusion, Semantic Alignment Network