Research On Automatic Text Summarization Based On TextRank

Posted on: 2020-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: N N Li
GTID: 2438330575459330
Subject: Computer application technology
Abstract/Summary:
With the growth of the Internet, the number of users has become increasingly large, and the spread of network platforms and apps has caused the volume of online text resources to grow exponentially. In today's fast-paced life, how to obtain high-quality, useful information from these resources quickly and efficiently, that is, text summarization, has become an active research direction. This thesis therefore studies summary generation algorithms from the perspective of the quality of the summary presented to users. The main work falls into three parts:

(1) An optimized text summarization algorithm based on TextRank is proposed. Starting from objective factors that can affect summary quality, the algorithm incorporates the structural information of the document and the contextual information of its sentences, such as the physical position of sentences or paragraphs, characteristic sentences, and core sentences, all of which may raise a sentence's weight. These factors are quantified and combined with the improved TextRank algorithm to generate a candidate sentence group for the summary. Because similar sentences can appear in the candidate group, the group then undergoes redundancy processing: an improved similarity calculation is used to remove highly similar sentences and obtain the final summary. The resulting summary has low redundancy and high generality, readability, and consistency. The algorithm improves the accuracy of the generated summaries, which indicates that it improves summary quality and is effective.

(2) A text summarization optimization algorithm based on a topic model and TextRank is proposed. First, to capture the main idea that the summary should express, the LDA topic model is combined with TextRank to extract the topic keywords of the text and form a keyword set. Second, based on this keyword set, the word weight matrix of the constructed TextRank graph is adjusted, and the improved TextRank algorithm is used to extract the candidate summary sentence set of the article. Sentence similarity is then used to reduce redundancy; the remaining sentences form the final summary and are output in the order in which they appear in the original text. The generated summary reflects the core content of the original text to a large extent, indicating the effectiveness of the method.

(3) An automatic text summary generation system based on TextRank is designed and implemented. With the two algorithms above as the theoretical basis, the structure of the system and its functional modules are first analyzed and designed, and the system is then implemented. The system generates the keywords of a text, produces its summary, and presents the results to the user immediately.
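The pipeline described in part (1), graph-based sentence ranking followed by redundancy removal and output in original order, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function names, the `lead_bonus` boost for the first sentence, the use of Jaccard similarity with a `redundancy_threshold` for redundancy removal, and the naive sentence splitter are all assumptions made for illustration. The edge weight is the word-overlap similarity commonly used with TextRank, and the LDA keyword weighting of part (2) is not covered.

```python
import math
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(sentence):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def overlap_similarity(a, b):
    """Edge weight: shared words normalized by the log of both sentence lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(a & b) / denom if denom > 0 else 0.0


def jaccard(a, b):
    """Bounded similarity in [0, 1], used here only for the redundancy check."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def textrank_scores(bags, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence similarity graph."""
    n = len(bags)
    w = [[overlap_similarity(bags[i], bags[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, k=2, lead_bonus=0.2, redundancy_threshold=0.5):
    sentences = split_sentences(text)
    bags = [tokens(s) for s in sentences]
    scores = textrank_scores(bags)
    # Hypothetical structural weighting: boost the document's first sentence.
    weighted = [s * (1 + lead_bonus) if i == 0 else s
                for i, s in enumerate(scores)]
    ranked = sorted(range(len(sentences)),
                    key=lambda i: weighted[i], reverse=True)
    chosen = []
    for i in ranked:
        if len(chosen) == k:
            break
        # Skip candidates that are too similar to an already chosen sentence.
        if all(jaccard(bags[i], bags[j]) <= redundancy_threshold
               for j in chosen):
            chosen.append(i)
    # Output in the order in which the sentences appear in the text.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    demo = ("TextRank builds a graph of sentences. "
            "TextRank builds a graph of sentences. "
            "Edges carry similarity weights between sentences. "
            "The weather was pleasant yesterday.")
    for line in summarize(demo, k=2):
        print(line)
```

In this sketch the position factor is applied as a multiplier after ranking. Folding structural factors into the edge weights or the initial scores would be an equally plausible reading of the description above.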
Keywords/Search Tags: abstract extraction, TextRank, LDA, candidate abstract sentence group, redundancy processing