
A Topic Sentence Extraction Method Combining LDA and TextRank

Posted on: 2018-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Wang
GTID: 2348330521451719
Subject: Computer technology
Abstract/Summary:
Constructing a topic model is a primary task and key step in natural language processing. It aims to extract topic information from documents, in the form of topic terms or topic sentences, so that people can read and further apply text information more conveniently. With the development of the Internet and changes in the media, social media has become both the carrier through which people obtain and transfer information and a platform for exchanging it. Among these platforms, Micro Blog, the social media service with the largest user base, highest user activity and strongest user stickiness in China, generates a large amount of unstructured text data every day. This data has great informational value and has become a focus of academic research. In addition, Micro Blog text is highly fragmented, and a single subject produces many related posts or reposts with comments. Extracting the key words of Micro Blog information is therefore of high scientific value. With this research goal, this thesis analyzes and compares several existing topic model algorithms. The main work is as follows:

1. Targeting the characteristics of Micro Blog web text, this thesis proposes a topic information extraction framework based on the integration of the LDA and TextRank algorithms.
2. To address the deficiency of the TextRank algorithm in topic information extraction, this thesis combines the information in the text with the weights carried by the text's topics. A fusion algorithm is proposed that initializes the weight of each vertex in the TextRank algorithm using the weight influence factor calculated by the LDA algorithm.
3. Experiments comparing the LDA algorithm, the TextRank algorithm and the proposed fusion algorithm verify the effectiveness of the fusion algorithm.

The main work of this thesis supplements topic modeling algorithms, shows that fusing different topic model algorithms is more effective for extracting topic information from text, and provides a new approach to semantic analysis and mining of the large volume of network text represented by Micro Blog.
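
The fusion idea in the abstract, starting the TextRank vertex weights from LDA-derived values, can be illustrated with a minimal sketch. The abstract does not define the weight influence factor, so everything specific here is an assumption: `word_weight` is a hypothetical dictionary standing in for per-word topic weights from a trained LDA model, the influence factor is taken to be the mean weight of a sentence's words, and the similarity measure is the word-overlap formula from the original TextRank formulation. The function names are illustrative and are not taken from the thesis.

```python
import math


def sentence_similarity(a, b):
    """Word-overlap similarity from the original TextRank formulation."""
    if not a or not b:
        return 0.0
    shared = len(set(a) & set(b))
    norm = math.log(len(a)) + math.log(len(b))
    return shared / norm if shared and norm > 0 else 0.0


def lda_initial_scores(sentences, word_weight):
    """Assumed influence factor: mean LDA topic weight of a sentence's words."""
    return [
        sum(word_weight.get(w, 0.0) for w in s) / len(s) if s else 0.0
        for s in sentences
    ]


def fused_textrank(sentences, word_weight, damping=0.85, iterations=3):
    """TextRank over sentences, with vertex scores initialized from LDA weights."""
    n = len(sentences)
    # Symmetric similarity graph; a sentence is not linked to itself.
    sim = [
        [0.0 if i == j else sentence_similarity(sentences[i], sentences[j])
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in sim]

    # Instead of the usual uniform start, begin from the LDA-based scores.
    scores = lda_initial_scores(sentences, word_weight)
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                sim[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


if __name__ == "__main__":
    # Toy tokenized posts and made-up topic weights, for illustration only.
    posts = [
        ["lda", "topic", "model"],
        ["textrank", "graph", "rank"],
        ["topic", "model", "graph"],
        ["weather", "sunny", "today"],
    ]
    weights = {"topic": 0.4, "model": 0.3, "lda": 0.2, "graph": 0.1}
    ranked = sorted(zip(fused_textrank(posts, weights), posts), reverse=True)
    for score, post in ranked:
        print(round(score, 3), " ".join(post))
```

One design note: a damped TextRank iteration converges to the same scores regardless of its starting values, so this sketch runs only a small, fixed number of iterations to let the LDA-based initialization influence the ranking. Whether the thesis limits iterations in this way, or also uses the LDA weights elsewhere in the update, is not stated in the abstract.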
Keywords/Search Tags: Topic model, LDA, TextRank, Micro Blog