
Research On Trend Prediction Of Micro-blogging

Posted on: 2016-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: M M Yang
GTID: 2308330476453384
Subject: Information and Communication Engineering
Abstract/Summary:
This paper studies popularity prediction for events on a microblog platform. High-quality predictions give more flexibility and preparation time when deploying limited resources (such as advertising budget or monitoring capacity) toward the more popular events. However, retrieving the data used for prediction is costly because of the large number of users and microblogs involved. We observe that the frequency and volume of microblogs belonging to a given event vary from user to user, and therefore propose that popular events can be predicted by concentrating on a small number of informative users. This paper investigates both the popularity prediction of events and the reduction of the cost of monitoring users to retrieve the data used in prediction.

For user selection, we first define the problem of selecting users to monitor for data retrieval in popularity prediction. We then present a Coverage Degree of Events and Reposts based User Selection algorithm (CDERUS), which takes into account both the Coverage Degree of Events (CDoE) and the Coverage Degree of Reposts (CDoR) when the monitoring cost budget is limited, so that the selected users cover more events and yield more repost links. We also implement a Benefit/Cost Ratio based User Selection algorithm (BCRUS), take the Fans Number algorithm (FN) as the baseline, and evaluate and compare performance on data retrieved from Sina Weibo. The results are: (1) with monitoring cost C = 10000, CDERUS improves event recall by 0.6% and 6.0% over BCRUS and FN, respectively; (2) with monitoring cost C = 10000, CDERUS increases the number of event reposts by 3.2% and 14.3% over BCRUS and FN, respectively.

For popularity prediction, we describe an Improved State Transition Based Prediction (ISTBP) algorithm. It improves on STBP in two respects: (a) in the clustering stage, we use a Spectral Clustering algorithm, which allows the number of clusters to be controlled; (b) in the prediction stage, we present a new scoring-based state decision algorithm. To overcome the limited prediction time resolution of ISTBP, we approach the problem by matching first along the time dimension and then along the event dimension, and present an Item-Temporal Based Prediction algorithm (ITBP). ITBP first slides over each template event to find the part most similar to the known part of the event to be predicted, then aligns the template events at those parts, and finally uses the weighted sum of the aligned template events as the prediction. We evaluate and compare performance using data retrieved from Sina Weibo by monitoring the selected users and by monitoring all users. The results are: (1) with TG = 6h (length of the known part) and TP = 12h (time to be predicted), the Root Mean Square Error (RMSE) of ITBP and ISTBP is 36.8% and 35.8% lower, respectively, than that of STBP on the dataset retrieved by monitoring all users; (2) with TG = 6h, TP = 12h and C = 3000 (monitoring cost budget), using ITBP as the predictor, CDERUS achieves an RMSE 5.1% and 9.9% smaller than BCRUS and FN, respectively; (3) the RMSE of ITBP is only 9.9% smaller when using data retrieved by monitoring all users (a monitoring cost of about 5.47 million, calculated over about 120 thousand users after preprocessing) than when monitoring the users selected by CDERUS under C = 10000.
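The three ITBP steps described in the abstract (sliding match, alignment, weighted sum) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the weighting scheme, so the Euclidean distance and inverse-distance weights used here are assumptions, and the function names `best_offset`, `predict` and `rmse` are hypothetical. Each event is assumed to be a list of popularity counts per time bin.

```python
from math import sqrt


def best_offset(template, known, horizon):
    """Slide `known` along `template` and return (offset, distance) of the
    closest window that still leaves `horizon` points after it.
    Returns None if the template is too short.
    Assumption: Euclidean distance as the similarity measure."""
    g = len(known)
    last = len(template) - g - horizon
    if last < 0:
        return None
    best = None
    for off in range(last + 1):
        window = template[off:off + g]
        dist = sqrt(sum((w - k) ** 2 for w, k in zip(window, known)))
        if best is None or dist < best[1]:
            best = (off, dist)
    return best


def predict(templates, known, horizon, eps=1e-9):
    """Predict the next `horizon` points of an event from its known part.
    Each template is aligned at its best-matching window, and the segments
    following those windows are combined as a weighted average.
    Assumption: weights are inverse distances, normalised to sum to 1."""
    g = len(known)
    weighted = [0.0] * horizon
    total = 0.0
    for template in templates:
        match = best_offset(template, known, horizon)
        if match is None:
            continue  # template too short to supply a future segment
        off, dist = match
        weight = 1.0 / (dist + eps)
        future = template[off + g:off + g + horizon]
        for i, value in enumerate(future):
            weighted[i] += weight * value
        total += weight
    if total == 0.0:
        raise ValueError("no template is long enough")
    return [v / total for v in weighted]


def rmse(predicted, actual):
    """Root mean square error between two equally long sequences."""
    n = len(predicted)
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```

For example, `predict(templates, known, horizon)` with a 6-bin known part and a 12-bin horizon would correspond to the TG = 6h, TP = 12h setting if each bin covers one hour. The bin width is also an assumption, not something the abstract states.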
Keywords/Search Tags: Microblog, popularity prediction, user selection