
Modeling And Application For Information Diffusion In Large-scale Social Networks

Posted on: 2019-08-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Li
Full Text: PDF
GTID: 1368330551456738
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of internet technology and the fast growth in the number of internet users, online social networks and social media websites have come to play a major role in the spread of information. Information dissemination in online social networks differs significantly from dissemination through traditional mass media, which has attracted a large number of researchers to the topic. Research on information diffusion in online social networks can help businesses design more effective marketing strategies, help users discover valuable information more easily, and help governments restrict the spread of harmful content. Although existing research has produced many results, the study of information diffusion in social networks still has many deficiencies, for example in identifying the key factors in information diffusion and in modeling the diffusion process. This thesis aims to provide insight into information diffusion on online social networks in two ways: by quantitatively analyzing the key factors that affect information diffusion, and by modeling the diffusion process. The main contributions of the thesis are as follows.

The first part studies the structure and evolution of very large cascades in online social networks. We propose the "grapes" model, which roughly describes the diffusion process of large cascades, and summarize four structural diffusion patterns, each of which represents a typical cascade structure and reflects a different diffusion mechanism. We investigate more than 45,000 large cascades, with sizes ranging from thousands to hundreds of thousands of nodes. We find that the underlying network structure of most large cascades is fairly sparse and weakly clustered, and that these diffusion processes share two surprising phenomena: first, even though these popular events spread widely, their "infection rate" is still very small; second, the probability that a node is infected does not increase linearly with the number of repeated exposures but remains fairly persistent. These phenomena strongly support the "grapes" model. Finally, we compare large and small cascades. The results suggest that structural features are not a key factor in predicting the future growth of cascades.

The second part studies the effect of message content on information propagation. A number of features are extracted from the text of the content and from user behavior data to indicate the attractiveness of the content. Experiments on a user repost behavior prediction task show that content attractiveness has a significant influence on information diffusion in social networks. We also propose several schemes for training a topic model to infer users' preferences and the degree of matching between a user's interest preferences and the content. Using these matching degrees as features yields a 7% to 14% improvement in user behavior prediction. The experimental results demonstrate that the matching degree plays an important role in the diffusion process.

The third part measures the effects of external influence on information diffusion. We develop an algorithm that allows us to distinguish the effects of external influence in the diffusion process. Applying the algorithm to millions of diffusion cascades yields four findings. First, although only a small portion of reshare activities arise directly from external influence, external influence plays a significant role in information diffusion. In particular, external influence affects roughly 50% to 70% of cascade nodes on average, and the effect becomes stronger as the cascade grows larger. Second, external influence motivates users to reshare from strangers and increases the odds that they become friends, which makes the underlying network denser and benefits the diffusion process. Third, we divide external influence into two categories: one mainly affects the size of the cascade tree and the other mainly affects its depth. Finally, we find that, owing to external services, influentials become less important and more large cascades can be triggered by ordinary people.

Finally, based on the above analysis of the key factors of information diffusion on large-scale social networks, we propose a novel Diffusion-Latent Dirichlet Allocation (D-LDA) model, which integrates content topic inference and social influence computation in the same generative process. Each iteration of the model alternates between two steps: the U-Part step infers the topic distributions of content and users, and the D-Part step calculates peer influence and the attractiveness of content. The model can distinguish the content-related factors from the influence-related factors that affect whether a user reshares content. We use Gibbs sampling to estimate the parameters of the model. The D-LDA model is evaluated through multiple experiments on a large-scale dataset. The experimental results show that the D-LDA model converges within a few iterations and produces significantly higher-quality results than prior models.
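
The cascade quantities the abstract refers to (cascade size, cascade depth, and "infection rate") can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the thesis's method: the function names `cascade_size_and_depth` and `infection_rate`, the `(parent, child)` reshare-edge format, and the `followers` mapping are all hypothetical, and the infection rate is assumed here to mean the fraction of exposed users (followers of any participant) who go on to reshare.

```python
from collections import defaultdict, deque


def cascade_size_and_depth(root, reshares):
    """Return (size, depth) of a cascade tree.

    `reshares` is an iterable of (parent, child) pairs, meaning that
    `child` reshared the message from `parent` (assumed input format).
    """
    children = defaultdict(list)
    for parent, child in reshares:
        children[parent].append(child)

    size, depth = 0, 0
    seen = {root}
    queue = deque([(root, 0)])
    # Breadth-first traversal from the original poster.
    while queue:
        node, level = queue.popleft()
        size += 1
        depth = max(depth, level)
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append((child, level + 1))
    return size, depth


def infection_rate(root, reshares, followers):
    """Fraction of exposed users who reshared (assumed definition).

    `followers` maps each user to the users who see that user's posts.
    A user counts as exposed if they follow at least one participant.
    """
    infected = {root} | {child for _, child in reshares}
    exposed = set()
    for user in infected:
        exposed.update(followers.get(user, ()))
    exposed.discard(root)  # the author is not exposed to their own post
    if not exposed:
        return 0.0
    return len(infected & exposed) / len(exposed)


# Toy data: "a" posts, "b" and "c" reshare from "a", "d" reshares from "b".
toy_reshares = [("a", "b"), ("a", "c"), ("b", "d")]
toy_followers = {"a": ["b", "c", "e", "f"], "b": ["d", "g"], "c": ["h"]}

size, depth = cascade_size_and_depth("a", toy_reshares)
rate = infection_rate("a", toy_reshares, toy_followers)
print(size, depth, round(rate, 3))
```

On the toy data the cascade has 4 nodes and depth 2, and 3 of the 7 exposed users reshare. A wide, shallow cascade and a narrow, deep one can have the same size, which is why size and depth are tracked separately.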
Keywords/Search Tags: social network, information diffusion, attractiveness of the content, external influence, Diffusion-Latent Dirichlet Allocation