Research On Automatic Summarization Of Chinese Literature Based On TextRank Algorithm

Posted on: 2020-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhao
Full Text: PDF
GTID: 2428330590958536
Subject: Management Science and Engineering
Abstract/Summary:
While the big data era generates huge literature resources, the problem of information overload has arisen. The number of texts has far exceeded the limit of manual processing. With the help of automatic summarization technology, readers can quickly grasp the core content framework of a full text through a condensed "mind map" of it. Document retrieval databases can compare and check papers more accurately and effectively, and databases can produce more accurate indexes, which is convenient for readers. Research on automatic summarization, which originated in 1958, has received increasing attention from industry.

TextRank is one of the representative graph-based algorithms. It is an unsupervised method and can be applied directly to a single document without a corpus. TextRank divides the text into text units that serve as nodes, and nodes that are similar to each other are connected by edges. The nodes and edges together form a graph model. By iterating the TextRank formula, the sentences with the highest weight scores are selected as the abstract.

This thesis studies single-document summarization of Chinese literature and analyzes computer science literature that has titles and abstract keywords. By integrating text features such as the title, abstract keywords and sentence position, the sentence weight algorithm is redesigned to generate abstracts. The experimental corpus consists of 50 Chinese computer science papers randomly downloaded from CNKI. One titled chapter is selected from each of the 50 papers, and the abstract keywords are retained. Average precision P, average recall R and average F are used to compare the automatic summarization performance of the improved sentence weight algorithm with that of the TextRank algorithm. The results show that the quality of the generated summaries is improved to a certain extent. The parameter of the final sentence weight in the improved algorithm is determined by the snowball method. An experiment also compares how well the improved algorithm and the TextRank algorithm cover the sentence set of the manual summary, which shows that the improved algorithm raises the quality of summary extraction.
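The baseline pipeline described above (sentences as nodes, similarity as edge weights, iterative scoring, top sentences as the abstract) can be sketched roughly as follows. This is a minimal illustration of standard TextRank and of a common sentence-overlap definition of P, R and F, not the thesis's improved weighting, whose formula is not given in this abstract. The function names, the character-level tokenizer (a stand-in for a real Chinese word segmenter), the damping factor 0.85 and the iteration limits are all assumptions made for the example.

```python
import math
import re


def split_sentences(text):
    """Split text on Chinese and Western sentence-ending punctuation."""
    parts = re.findall(r"[^。！？!?.]+[。！？!?.]?", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Assumed tokenizer: single CJK characters plus lowercase Latin words.
    A real system would use a Chinese word segmenter instead."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", sentence.lower())


def similarity(a, b):
    """Shared-token count normalized by log sentence lengths
    (the sentence similarity used in the original TextRank formulation)."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank(sentences, d=0.85, iterations=50, tol=1e-6):
    """Return one TextRank score per sentence (weighted, undirected graph)."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = similarity(tokens[i], tokens[j])
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        new = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            rank = sum(w[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if w[j][i] > 0)
            new.append((1 - d) + d * rank)
        converged = max(abs(x - y) for x, y in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


def summarize(text, k=2):
    """Pick the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def precision_recall_f(selected, reference):
    """P, R and F over sets of sentence indices (assumed definition)."""
    sel, ref = set(selected), set(reference)
    hit = len(sel & ref)
    p = hit / len(sel) if sel else 0.0
    r = hit / len(ref) if ref else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

A feature-aware variant in the spirit of the thesis would presumably combine these graph scores with title, keyword and position scores before selecting sentences, but how those terms are combined is not stated here, so the sketch stops at the baseline.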
Keywords/Search Tags: Chinese automatic summarization, TextRank, discourse structure, literature abstract extraction