Microblogging-Based Hot Topic Detection Algorithm and Implementation

Posted on: 2015-12-07  Degree: Master  Type: Thesis
Country: China  Candidate: J Li  Full Text: PDF
GTID: 2428330488499822  Subject: Software engineering
Abstract/Summary:
With the continuous development of social networking and the mobile Internet, more and more people actively participate in social life online. This is a challenge for traditional news media, and also an opportunity for them to transform into new media. How to find valuable news and information more quickly has become a problem the media must solve. The value of news often depends on its timeliness, so obtaining the most valuable news requires fast access to social hot spots and emerging trends. Based on a hot topic discovery algorithm for microblogging, this thesis detects social hot topics from microblogs in real time and promptly identifies the focus of society and public opinion in order to uncover valuable news.

The main work is as follows. Firstly, the thesis reviews the background: social network theory, including the Six Degrees of Separation and small-world theories; web crawlers, whose implementations are divided into recursive and non-recursive; web page parsing, in which the techniques for extracting information from crawled pages fall into five categories; and the four most important topic model implementations, whose advantages and disadvantages are compared. Secondly, it designs the topic detection and recognition model, together with the algorithms for hot topic detection and for tracking and identifying potential hot topics. The Boolean model and the vector space model are used to filter noise from the comments on a topic; sentiment analysis is then applied to the noise queue, and comments that had been regarded as noise are returned to the comment queue. Finally, the hot topic detection algorithm and the algorithm for tracking and identifying potential hot topics are designed. Thirdly, micro-topics on Sina Weibo are crawled, along with the number of comments and the number of forwards for each micro-topic. The topic detection algorithm designed in this thesis is used to re-rank the top topics on Sina Weibo; the results are reasonable, which validates the topic discovery algorithm.

The innovations of this thesis are as follows: (1) It presents a topic comment denoising algorithm, which divides comments into those related to the topic and those unrelated to it. This is very useful for calculating the topic's heat value, HotDegree. (2) It proposes an incremental observation algorithm (IOA) for discovering hot topic trends. A topic becomes hot through a process over time, and judging that trend is the main problem this algorithm addresses.
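To make the comment denoising idea concrete, the following is a minimal Python sketch of vector space model filtering, not the thesis's actual implementation. It assumes cosine similarity over raw term counts and a fixed threshold; the function names, the threshold value, and the weighted-sum form of `hot_degree` are all hypothetical, since the abstract names HotDegree but does not give its formula.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector over lowercased word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def split_comments(topic_text, comments, threshold=0.1):
    """Partition comments into (related, noise) by similarity to the topic.

    The threshold is an assumed value chosen for illustration only.
    """
    topic_vec = term_vector(topic_text)
    related, noise = [], []
    for comment in comments:
        score = cosine_similarity(topic_vec, term_vector(comment))
        (related if score >= threshold else noise).append(comment)
    return related, noise


def hot_degree(related_comment_count, forward_count,
               w_comment=1.0, w_forward=1.0):
    """Hypothetical heat score: weighted sum of related comments and forwards."""
    return w_comment * related_comment_count + w_forward * forward_count


if __name__ == "__main__":
    topic = "earthquake relief donations"
    comments = [
        "Donations for earthquake relief are pouring in",
        "Buy cheap phones here",
    ]
    related, noise = split_comments(topic, comments)
    print(related, noise, hot_degree(len(related), forward_count=5))
```

The regular-expression tokenizer only works for space-delimited text; real Sina Weibo posts would need a Chinese word segmenter in place of `term_vector`'s tokenization step.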
Keywords/Search Tags: Microblogging, Web crawler, Website parsing, Topic model, Hot topic discovery algorithm