
Research On Hot Topic Discovery Technology Of Micro-Blogging Network

Posted on: 2015-01-12  Degree: Master  Type: Thesis
Country: China  Candidate: X X Li  Full Text: PDF
GTID: 2268330425988920  Subject: Communication and Information System
Abstract/Summary:
As a product of Web 2.0, microblogging has developed rapidly in recent years. Because it is fast and convenient, more and more information spreads through microblogs, including social hotspots and news events. As a result, discovering, extracting, and analyzing this information have become active research topics, and research on how to discover hot topics in microblogs is significant. The main work of this thesis is as follows:

1. By studying microblog text and long-text clustering, this thesis concludes that traditional clustering algorithms cannot be used to discover topics because microblog texts are so short. It therefore proposes a solution based on expanding the text with comments and HowNet lexemes, which overcomes the problems of short text and diversified expression. It also proposes a topic discovery solution based on text clustering.

2. This thesis studies traditional clustering algorithms and analyzes their characteristics. Targeting the characteristics of microblog text, it proposes a K-means algorithm with BIRCH-based initialization. The algorithm removes the need to set the parameter k manually and improves the selection of the cluster centers. In addition, the noise immunity of the algorithm is improved, and the K-means step reduces the influence of input order on the results.

3. This thesis studies the characteristics of microblog topics and their propagation process, and analyzes the factors that may influence the heat of a microblog. It also proposes a heat evaluation model and gives a detailed formula for calculating the heat.

4. Based on this research, the thesis designs a system for discovering microblog hot topics, implemented in Java, and presents the results through a B/S (browser/server) architecture.

This work has been supported by the National Natural Science Foundation of China under Grants 61172072 and 61271308, the Beijing Natural Science Foundation under Grant 4112045, the Research Fund for the Doctoral Program of Higher Education of China under Grant W11C100030, the Beijing Science and Technology Program under Grant Z121100000312024, and the Beijing Municipal Commission of Education Discipline Construction and Graduate Construction Project.
Keywords/Search Tags: Microblogging short text, Clustering algorithm, Hot topics
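
The idea in item 2 of the abstract, using a BIRCH-style pass to choose k and the initial centers before running K-means, can be illustrated with a small Java sketch. This is not the thesis's implementation. It assumes dense numeric vectors with Euclidean distance, and it replaces BIRCH's CF-tree with a flat, single-pass list of clustering features (point count and linear sum) governed by a distance threshold. The class name BirchSeededKMeans, its method names, and the threshold value are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a flat, simplified stand-in for BIRCH (no CF-tree),
// used to pick k and the starting centroids for ordinary K-means.
final class BirchSeededKMeans {

    /** Clustering feature of one subcluster: point count and linear sum. */
    static final class Cf {
        int n = 1;
        final double[] sum;
        Cf(double[] p) { sum = p.clone(); }
        void add(double[] p) {
            n++;
            for (int i = 0; i < sum.length; i++) sum[i] += p[i];
        }
        double[] centroid() {
            double[] c = new double[sum.length];
            for (int i = 0; i < c.length; i++) c[i] = sum[i] / n;
            return c;
        }
    }

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return Math.sqrt(s);
    }

    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (dist(p, centroids[c]) < dist(p, centroids[best])) best = c;
        }
        return best;
    }

    /** Single pass: absorb a point into the nearest subcluster whose centroid
     *  lies within the threshold, otherwise open a new subcluster. */
    static double[][] seedCentroids(double[][] points, double threshold) {
        List<Cf> cfs = new ArrayList<>();
        for (double[] p : points) {
            Cf best = null;
            double bestD = Double.MAX_VALUE;
            for (Cf cf : cfs) {
                double d = dist(p, cf.centroid());
                if (d < bestD) { bestD = d; best = cf; }
            }
            if (best != null && bestD <= threshold) best.add(p);
            else cfs.add(new Cf(p));
        }
        double[][] seeds = new double[cfs.size()][];
        for (int i = 0; i < seeds.length; i++) seeds[i] = cfs.get(i).centroid();
        return seeds;
    }

    /** Lloyd's K-means with k = centroids.length; centroids are updated in place. */
    static int[] kMeans(double[][] points, double[][] centroids, int maxIter) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, -1);
        for (int iter = 0; iter < maxIter; iter++) {
            boolean changed = false;
            for (int i = 0; i < points.length; i++) {
                int c = nearest(points[i], centroids);
                if (c != labels[i]) { labels[i] = c; changed = true; }
            }
            if (!changed) break; // assignments are stable
            double[][] sums = new double[centroids.length][points[0].length];
            int[] counts = new int[centroids.length];
            for (int i = 0; i < points.length; i++) {
                counts[labels[i]]++;
                for (int d = 0; d < points[i].length; d++) sums[labels[i]][d] += points[i][d];
            }
            for (int c = 0; c < centroids.length; c++) {
                if (counts[c] == 0) continue; // keep the old centroid for an empty cluster
                for (int d = 0; d < sums[c].length; d++) centroids[c][d] = sums[c][d] / counts[c];
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        double[][] pts = {{0, 0}, {0.2, 0.1}, {0.1, 0.3}, {5, 5}, {5.2, 4.9}, {4.8, 5.1}};
        double[][] seeds = seedCentroids(pts, 1.0); // threshold is a made-up value
        int[] labels = kMeans(pts, seeds, 50);
        System.out.println("k = " + seeds.length + ", labels = " + Arrays.toString(labels));
    }
}
```

In this sketch k is never passed in: it falls out of how many subclusters the threshold pass opens, which is the sense in which the seeding step replaces a manually chosen k. A real system for microblog text would more plausibly operate on sparse term vectors with a cosine-style similarity, and the threshold would need tuning on actual data.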