Research And Implementation Of Information Propagation With Retweet Probability On Online Social Network

Posted on: 2015-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: R H Hou
Full Text: PDF
GTID: 2308330464964625
Subject: Computer technology
Abstract/Summary:
The rapid development of social networks has attracted wide attention. Sina, for example, has reached 600 million users, and more and more researchers are studying online social networks. On these networks, people discuss their ideas and express their interests and opinions, all of which generates a large amount of social data. How to accurately model information propagation cascades has therefore become a hot topic. Retweeting is the atomic behavior that makes up information dissemination. In this thesis, we first obtain the retweet probability, which is affected by several factors, then put forward a behavior-based information dissemination model, and finally simulate information dissemination on a real data set. Experiments show that the proposed model overcomes the homogeneity introduced by the traditional fixed-probability model and better simulates the real process of information dissemination. Details are as follows:

Crawling microblogging data. This thesis sets up a Hadoop-based distributed platform for crawling microblogging data. Running on eight machines, it crawls microblog user data, relationship data, and text data.

Analysis of retweeting. By analysing the microblogging data, we select appropriate features and a corresponding model, and use them to obtain each user's retweet probability. In this thesis, we fit a logistic regression model to the data for every user. First, we pre-process the data; then we extract the relevant features and train the logistic regression model on them; finally, we obtain the retweet probability.

Information propagation experiments. We first extract the network structure from the data. Because forwarding between users is what makes up the dissemination of information, this thesis analyses forwarding behavior and, on that basis, presents the SIS-p and IC-p propagation models.
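
The retweet-probability step described above (pre-process, extract features, fit a logistic regression, read off a probability) can be illustrated with a minimal sketch. It is not the thesis implementation: the two features (how closely the user follows the author, and how well the post matches the user's interests), the toy data, and the function names fit_logistic and retweet_probability are all assumptions made for illustration, and plain batch gradient descent stands in for whatever solver the thesis used.

```python
import math

def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights (bias first) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi  # derivative of the log-loss w.r.t. the linear score
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def retweet_probability(w, x):
    """Predicted probability that a user with feature vector x retweets."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical training data for ONE user: each row is
# [closeness_to_author, topic_match], label 1 = retweeted, 0 = ignored.
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.4, 0.7],
     [0.2, 0.1], [0.1, 0.3], [0.3, 0.2], [0.6, 0.2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights = fit_logistic(X, y)
p_high = retweet_probability(weights, [0.9, 0.9])
p_low = retweet_probability(weights, [0.1, 0.1])
```

Here p_high and p_low are the fitted probabilities for a post that scores high or low on both hypothetical features. Fitting one such model per user is what makes the resulting probabilities heterogeneous across the network.
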
Finally, propagation simulation experiments are performed on the extracted network, and they yield some novel results. The homogeneous retweet probability used in the original model underestimates the speed of information propagation, although the scale of propagation is at almost the same level. In addition, the initial poster of a message is very important for a given propagation, which enables us to devise effective strategies for preventing the epidemic spread of rumors.
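
The abstract does not define IC-p in detail. The sketch below assumes it is the standard independent cascade model with the single fixed activation probability replaced by each user's own retweet probability; that reading, the toy follower graph, and the name simulate_ic are assumptions, not the thesis code.

```python
import random

def simulate_ic(followers, retweet_prob, seed_user, rng):
    """Run one independent-cascade spread from seed_user.

    followers[u] lists the users who see u's posts; retweet_prob[v] is the
    probability that v retweets a post it sees (assumed per-user, IC-p style).
    Returns the set of users who posted or retweeted, and the number of new
    retweeters in each round.
    """
    active = {seed_user}
    frontier = [seed_user]
    rounds = []
    while frontier:
        newly_active = []
        for u in frontier:
            for v in followers.get(u, []):
                # Each active user gets one chance to reach each follower.
                if v not in active and rng.random() < retweet_prob[v]:
                    active.add(v)
                    newly_active.append(v)
        if newly_active:
            rounds.append(len(newly_active))
        frontier = newly_active
    return active, rounds

# Hypothetical follower graph: "a" is the initial poster.
followers = {"a": ["b", "c"], "b": ["d"], "c": ["d", "e"],
             "d": [], "e": ["f"], "f": []}

# Degenerate cases are deterministic and make the mechanics easy to check.
always = {u: 1.0 for u in followers}
never = {u: 0.0 for u in followers}
full_spread, full_rounds = simulate_ic(followers, always, "a", random.Random(0))
no_spread, no_rounds = simulate_ic(followers, never, "a", random.Random(0))

# Heterogeneous per-user probabilities (illustrative values only).
per_user = {"a": 0.5, "b": 0.9, "c": 0.2, "d": 0.6, "e": 0.4, "f": 0.8}
spread, rounds = simulate_ic(followers, per_user, "a", random.Random(42))
```

Passing the same constant for every user recovers the fixed-probability model, so the two settings can be compared on the same graph by averaging cascade size and per-round counts over many runs. Changing seed_user gives a simple way to probe how much the initial poster matters.
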
Keywords/Search Tags:Retweet probability, Online social network, Infectious model, Diffusion model, Logistic regression