
Research And Application Of Extractive Generative Summarization Technology Fused With Topic Information

Posted on: 2022-10-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wei
Full Text: PDF
GTID: 2518306326471614
Subject: Software engineering
Abstract/Summary:
The rapid development of the Internet has brought us into an era of information explosion. Most existing online information platforms suffer from redundant data, numerous advertisements, and false information, so it is particularly important to obtain useful information from massive amounts of data quickly and accurately. To this end, using automatic text summarization to help users obtain reliable information in the shortest possible time has become a research hotspot in natural language processing.

Widely used automatic text summarization techniques currently fall into three categories: extractive summarization, generative summarization, and extractive-generative summarization, which combines the two. Extractive summarization based on statistics or rules is simple, adaptable, and stays on topic, yet it has poor semantic comprehension. By contrast, generative summarization based on deep learning is better at understanding and generating text, but it still suffers from poor readability, off-topic summaries, and other issues. Combining the two to accomplish the summarization task is the core idea of the extractive-generative method, and our research builds on this idea. The main contents are as follows:

Considering the advantages and disadvantages of traditional extractive summarization and of generative summarization based on deep learning, we propose TICTS, a sequence-to-sequence summary generation model combined with an attention mechanism. First, text generation is based mainly on the Seq2Seq model. The encoder uses a Bi-GRU, whose bidirectional structure retains context information more completely and further reduces information loss. Second, to address the lack of control and guidance from key information during generation in generative summarization, this thesis uses the K-Means algorithm on word vectors to cluster documents and filter the topic information of the text, and builds a topic attention mechanism. This is combined with the Bahdanau attention mechanism to capture the correlation between input and output words and between the input and the topic information. Finally, the decoder adopts an LSTM, which improves the topic relevance of the generated summary while making full use of the context information of the text. The model is evaluated on the LCSTS dataset using the ROUGE metric, and the experimental results verify its effectiveness.

We also designed and implemented a news briefing system. By applying the TICTS model to news summarization, we provide users with a simple and convenient platform for obtaining news. To meet user needs, the system comprises a news summary browsing module, a news classification module, a news crawling module, a summary generation module, and a text-to-audio conversion module. Of these, news summary browsing and summary generation are the core modules. Finally, we implemented the data interaction between the front-end news interface and the back-end news release system. Automatic text summarization thus not only helps address the redundant, one-sided, and noisy information on news websites, but also improves reading efficiency, helps readers avoid clickbait news, and meets the growing demand for fragmented reading. Overall, the system has clearly defined functions and good application value and prospects.
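
The ROUGE evaluation mentioned in the abstract can be illustrated with a minimal sketch of ROUGE-N, which measures n-gram overlap between a generated summary and a reference summary. This is not the thesis's evaluation code: the function names `ngrams` and `rouge_n`, the toy sentences, and the whitespace tokenization are all assumptions made for illustration. For Chinese data such as LCSTS, the tokens would often be single characters, but the abstract does not state which tokenization or ROUGE variant was used.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Compute ROUGE-N recall, precision and F1 between two token sequences."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    # Clipped overlap: an n-gram counts only as often as it occurs in both.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


# Toy example with whitespace tokens (hypothetical sentences, not from LCSTS).
reference = "the model generates a short news summary".split()
candidate = "the model generates a news summary".split()
scores = rouge_n(candidate, reference, n=1)
```

Calling `rouge_n(candidate, reference, n=2)` gives the bigram variant. Recall is usually the headline figure for ROUGE-N, while F1 is commonly reported when summary lengths vary.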
Keywords/Search Tags:Automatic text summarization, Extraction generative, Clustering, Attention mechanism, News briefing system