Research On Predicting The Popularity Of Sina Weibo Topic

Posted on: 2017-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Tan
GTID: 2308330488482437
Subject: Information Science
Abstract/Summary:
In recent years, Sina Weibo has moved from a popular novelty into the mainstream and become a household name. It has changed how we live both online and offline, and has even replaced traditional media as our primary channel for publishing and obtaining information. Although microblog information is fragmented and scattered, when a large number of microblogs discuss the same topic, these fragments aggregate and spread the topic rapidly, so that the topic becomes a focus of attention and gives rise to a new public voice. Given Sina Weibo's popularity and the influence of hot topics, predicting which topics will become popular in the near future offers a large business opportunity and has become an important task in social marketing and public opinion monitoring.

Current research on predicting Sina Weibo topics focuses mainly on temporal trends in information flow, on opinion leaders, and on the topology of the microblogging graph. Building on these studies, this thesis summarizes the factors that make a topic popular. We formulate popularity prediction as a classification task over topic features and use five standard classification models (Naive Bayes, k-Nearest Neighbor, C4.5, Logistic Regression, and Support Vector Machine) for prediction. The main challenge is identifying effective features for describing a topic. We first analyze the factors that affect the popularity of a Sina Weibo topic. After visualizing the propagation paths of hot microblogs, we found that the early popularity of microblogs, user influence, and topic attributes are important factors in making a topic popular. Based on these three factors, we extract features from the early dissemination dynamics of a topic, from user influence, and from topic content, and construct three complementary feature subsets.

We collected 2,166 topics, comprising nearly 1.625 million microblogs, from the Sina Weibo platform for an experimental analysis of the prediction model. The experimental results show that the standard classifiers using the extracted features significantly outperform the baseline methods that take feature subsets as input. Among the five classifiers, C4.5 performs best in terms of the F measure.

Chapter 1 introduces the rapid development of Sina Weibo and the reasons for its popularity, and reviews domestic and international research on trending topic prediction. Chapter 2 introduces microblogging theory and classification-based prediction techniques. Chapter 3 discusses the factors that affect the popularity of microblog topics. Chapter 4 gives formal definitions of microblogging-related concepts and details of the feature calculations, and proposes a framework and process for the trending topic prediction model. Chapter 5 presents an experiment on the prediction model, including the Sina Weibo crawler framework and data cleaning. The thesis closes with a short summary of its main work and contributions and proposes directions for future improvement.
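
As a minimal sketch of the evaluation idea described in the abstract, the Python below concatenates three feature subsets into one vector and scores binary predictions with the F measure. It is not the thesis's implementation. The function names (`combine_features`, `f_measure`), the label encoding (1 = popular topic, 0 = not popular), and the default `beta = 1.0` (the F1 variant) are assumptions made for illustration, since the abstract does not state which F measure was used or how features were encoded.

```python
from typing import List, Sequence


def combine_features(early: Sequence[float],
                     user: Sequence[float],
                     content: Sequence[float]) -> List[float]:
    """Concatenate three feature subsets into one feature vector.

    The subset names mirror the abstract (early dissemination, user
    influence, topic content); the actual features are not given there.
    """
    return [*early, *user, *content]


def f_measure(y_true: Sequence[int], y_pred: Sequence[int],
              beta: float = 1.0) -> float:
    """F measure for the positive class (assumed: 1 = popular topic)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Made-up labels for eight topics, purely for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
    print(f_measure(y_true, y_pred))
```

With these made-up labels there are 3 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.75 and the F1 score is 0.75. Comparing classifiers then amounts to computing this score for each model's predictions on the same held-out topics.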
Keywords/Search Tags: Sina microblogging, trending topic prediction, feature construction, classification