
Research On Hot Topic Discovery Of Stack Overflow Based On CBOW-LDA Topic Model

Posted on: 2019-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
GTID: 2428330545997137
Subject: Software engineering
Abstract/Summary:
With the development of Internet technology, websites of many kinds have become an irreplaceable way for users to publish, search for, and obtain information. Stack Overflow is a popular programming Q&A website favored by programmers, offering a platform for posting questions and seeking solutions. Among the many questions on Stack Overflow, some become hot topics that reflect demand and trends in the field of programming. By mining the semantics of programming question text, we can therefore identify the programming knowledge that developers focus on, which contributes to understanding software information and to related research on hot topics. Owing to the high-dimensionality problem and the inherent shortcomings of LDA, it is difficult to detect topics from a large number of short texts in social networks. We propose a new model-based topic detection method called CBOW-LDA, which clusters similar words by vector similarity before topic detection. This reduces the dimensionality of the LDA output and makes the topics clearer. An analysis of topic perplexity on an experimental dataset of Stack Overflow posts from 2010 to 2015 shows that topics detected by CBOW-LDA have lower perplexity than those detected by TF-LDA, a baseline that uses vectors based on word-frequency weighting. To apply CBOW-LDA to hot topics on Stack Overflow, we built a manually annotated standard evaluation set against which to compare the experimental results. Every measure is better for CBOW-LDA than for TF-LDA, which indicates that CBOW-LDA performs better both as an algorithm and in hot topic mining. Through our experiments we also identified the hot issues and hot words of each topic over nearly six years, and we designed and implemented a prototype hot topic study system based on CBOW-LDA. In summary, this thesis offers ideas and methods for research on topic modeling and hot topic mining, and has research significance and reference value.
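The word-merging step described in the abstract (clustering similar words by vector similarity before topic detection) can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: the abstract does not specify the clustering procedure or any threshold, so the greedy single-pass grouping, the `threshold` value, the function names, and the three-dimensional toy vectors (standing in for trained CBOW embeddings) are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cluster_words(vectors, threshold=0.9):
    """Greedy single-pass grouping (an assumed procedure, not the thesis's):
    each word joins the first representative whose cosine similarity is at
    least `threshold`; otherwise it becomes a new representative."""
    representatives = []
    mapping = {}
    for word, vec in vectors.items():
        for rep in representatives:
            if cosine(vec, vectors[rep]) >= threshold:
                mapping[word] = rep
                break
        else:
            representatives.append(word)
            mapping[word] = word
    return mapping

def merge_tokens(docs, mapping):
    """Replace every token by its cluster representative."""
    return [[mapping.get(tok, tok) for tok in doc] for doc in docs]

# Hypothetical toy vectors; real ones would come from a trained CBOW model.
toy_vectors = {
    "error": [0.9, 0.1, 0.0],
    "exception": [0.85, 0.15, 0.05],
    "array": [0.1, 0.9, 0.1],
    "list": [0.15, 0.85, 0.1],
    "thread": [0.0, 0.1, 0.95],
}
toy_docs = [["error", "array", "thread"], ["exception", "list"], ["list", "error"]]

mapping = cluster_words(toy_vectors)
merged_docs = merge_tokens(toy_docs, mapping)

# The vocabulary seen by the topic model shrinks after merging.
vocab_before = {tok for doc in toy_docs for tok in doc}
vocab_after = {tok for doc in merged_docs for tok in doc}
print(len(vocab_before), "->", len(vocab_after))
```

The merged documents would then be passed to an ordinary LDA implementation; because near-synonyms share one vocabulary entry, the topic model works over a smaller vocabulary.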
Keywords/Search Tags: Stack Overflow, CBOW-LDA model, topic detection, hot programming topic