Research On The Method Of Generating News Text Summaries Fused With Keywords

Posted on: 2021-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: S Ning
Full Text: PDF
GTID: 2518306200953599
Subject: Computer technology
Abstract/Summary:
In recent years, with the development of the Internet, information redundancy has become a major obstacle to quickly understanding the latest information. News text summary generation has become an indispensable technique for fast reading. Fusing more feature information to generate more accurate and readable summaries is the mainstream direction of summary generation research. To address the shortcomings of existing keyword extraction and summary generation methods for Chinese news text, and taking into account that Chinese news texts are short and their information is highly concentrated, this thesis studies methods for generating Chinese news text summaries fused with keywords. The main research work is as follows:

(1) A news text keyword extraction method based on the differences between LSTM and LDA. To account for the impact of semantic information on TextRank, and considering the high concentration of information in news headlines as well as the coverage and difference of keywords, a keyword extraction method based on the differences between LSTM and LDA is proposed. First, the news text is preprocessed to obtain candidate keywords. Second, the LDA topic model is used to obtain the topic difference influence degree of each candidate keyword. Then, the LSTM model and the word2vec model are combined to calculate the semantic relevance influence degree between each candidate keyword and the title. Finally, the candidate keyword nodes perform non-uniform transfers according to the topic difference influence degree and the semantic relevance influence degree, which yields the final candidate keyword ranking from which keywords are extracted. This method combines three attributes of keywords: semantic importance, coverage, and difference. Experimental results on the Sogou News Corpus show that the method significantly improves accuracy and recall compared with traditional methods.

(2) A Chinese news text summary generation method fused with keywords. Existing seq2seq models are prone to producing semantically irrelevant words when generating summaries. Given the short length of news text and its high concentration of information, we emphasize the role of keywords in generating Chinese news summaries and propose a summary generation method fused with keywords. First, the source text words are input into a Bi-LSTM model in turn. Second, the hidden state obtained at each time step is input into a sliding convolutional neural network to extract local features between each word and its neighboring words. Then, the keyword information and a gating unit are used to filter the news text information and remove redundant information. Finally, the global feature information of each word is obtained through a self-attention mechanism, and encoding yields a hierarchically combined global word feature representation. The resulting word features are input into an LSTM model with an attention mechanism and decoded to obtain the summary. The method models the n-gram features of news words with sliding convolutional networks and uses the self-attention mechanism to obtain hierarchical local and global word feature representations. At the same time, it takes into account the important role of keywords in generating news summaries and uses the gating unit to remove redundant information, obtaining more accurate news text information. Experiments on the Sogou News Corpus show that the proposed method improves the quality of the generated summaries, as measured by the ROUGE-1, ROUGE-2, and ROUGE-L values.

(3) A prototype system for generating Chinese news text summaries. Based on the keyword extraction method built on the differences between LSTM and LDA, the keyword-fused Chinese news text summary generation method, and the corpus collected for the experiments, a prototype system for Chinese news text summary generation fused with keywords was designed and built. First, we introduce the tools and frameworks used to build the system. Second, we describe the main functions implemented by the system and elaborate on its design process. Finally, we show the results of news keyword extraction and news text summary generation.
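The non-uniform transfer step described in item (1) can be illustrated with a small sketch. The abstract does not give the formula, so everything specific here is an assumption made for illustration: the function names `build_cooccurrence_graph` and `biased_textrank`, the linear mix controlled by `alpha`, the rule that a node passes rank to each neighbor in proportion to that neighbor's combined influence, and the hand-written score dictionaries that stand in for the LDA topic difference scores and the LSTM/word2vec title relevance scores.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=3):
    """Link each word to the next (window - 1) words in the token sequence."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    return neighbors


def biased_textrank(neighbors, topic_score, title_score,
                    alpha=0.5, damping=0.85, iterations=50):
    """TextRank variant with non-uniform transfer between candidate keywords.

    topic_score / title_score are hypothetical stand-ins for the topic
    difference influence and the semantic relevance influence in the abstract.
    The linear combination below is an assumption, not the thesis formula.
    """
    nodes = sorted(neighbors)
    influence = {
        w: alpha * topic_score.get(w, 0.0) + (1 - alpha) * title_score.get(w, 0.0)
        for w in nodes
    }
    rank = {w: 1.0 / len(nodes) for w in nodes}
    for _ in range(iterations):
        new_rank = {w: (1 - damping) / len(nodes) for w in nodes}
        for u in nodes:
            total = sum(influence[v] for v in neighbors[u])
            for v in neighbors[u]:
                # Non-uniform transfer: neighbors with higher influence
                # receive a larger share; fall back to uniform if all are zero.
                if total > 0:
                    share = influence[v] / total
                else:
                    share = 1.0 / len(neighbors[u])
                new_rank[v] += damping * rank[u] * share
        rank = new_rank
    return rank


# Toy usage with made-up tokens and scores (not data from the thesis).
tokens = ["bank", "rate", "policy", "bank", "market", "rate"]
topic = {"bank": 0.9, "rate": 0.6, "policy": 0.3, "market": 0.2}
title = {"bank": 0.8, "rate": 0.7, "policy": 0.1, "market": 0.4}
scores = biased_textrank(build_cooccurrence_graph(tokens), topic, title)
for word in sorted(scores, key=scores.get, reverse=True):
    print(word, round(scores[word], 4))
```

Setting every influence score to the same value reduces this to ordinary unweighted TextRank, which makes the effect of the two influence terms easy to compare on a given text.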
Keywords/Search Tags: News keyword extraction, news summary generation, keyword information fusion, gating unit, global coding