Font Size: a A A

Research On Key Technologies Of Information Diffusion In Social Networks

Posted on:2019-11-01Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y SuFull Text:PDF
GTID:1368330551450042Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Online social networks are fundamental mediums for information diffu-sion,because the decisions of the social networking users tend to be influenced by their neighbors,and such influence can propagate through links and spread over the network.Understanding the patterns of information diffusion in so-cial networks has important social and economic values.This paper focuses on the problem of contagion adoption and the problem of revenue maximization.The problem of contagion adoption is to study if a user will adopt a contagion in a social network,and the problem of revenue maximization is to maximize the overall revenue from the purchasing behaviors of users under the influ-ence propagations.For the problem of contagion adoption,this paper proposes the Evolutionary Game Theoretic(EGT)model,the Interaction-aware Diffu-sion(IAD)model,and the Heterogeneous Users and Contagions Embedding(HUCE)model.For the problem of revenue maximization,this paper proposes the Calculation-based Randomized(CR)algorithm and the Calculation-based max-Heap updating Greedy(CHG)algorithm.Most studies focus on the scenario that only a singe piece of information spreads,regardless of the interactive effects among multiple contagions spread-ing at the same time.Multiple contagions may compete or cooperate to get propagated.Since individual users have limited attention and have been ex-posed to excessive information in online social networks,simultaneous news and events may have to compete for spreading.In addition to information com-petition,there are also some situations that information cooperates to get prop-agated.For the problem of contagion adoption,as the first work in this paper,an Evolutionary Game Theoretic Framework is proposed to model the interac-tions among multiple contagions.The basic idea is that different contagions in social networks are similar to the multiple organisms in a population,and the diffusion process is like organisms interact and then evolve from one state to another.This framework statistically learns the payoffs as contagions interact-ing with each other and builds the payoff matrix.Since the large amount of information in online social networks,fitting the payoffs for each pair of conta-gions is impossible.To address this challenge,a contagion clustering approach that based on the number of common infectious nodes is proposed.As a result,the cluster-to-cluster interactions are modeled instead of the full contagion-to-contagion interactions,which significantly reduce the number of parameters to learn and make our approach efficient and scalable.Experimental results on real-world Digg dataset show that the proposed game theoretic framework helps to comprehend the information diffusion process better and can predict users'forwarding behaviors with more accuracy than previous studies.The analyses of evolution dynamics of contagions and evolutionarily stable strategy reveal whether a contagion can be promoted or suppressed by others in the diffusion process.For information diffusion,social roles and their interactions are also vital.Since the diffusion of contagions is affected by network structures,and social roles of users reflect network structures,it is intuitive to involve social roles according to their structure characteristics,as well as their interactions to build a more comprehensive model.In addition,since each user has her own preference on contagions,there are interactions between users and contagions,which can also affect information diffusion.So far,we have three kinds of interactions,and how to integrate them together into a coherent model is a challenge for the problem of contagion adoption.The second work and the third work in this paper solve this problem from different perspective:the second work proposes a probability model,which is the Interaction-aware Diffusion(IAD)model;the third work proposes a feature selection model,which is the Heterogeneous Users and Contagions Embedding(HUCE)model.Besides integrating three kinds of interactions together into a coherent model,the second work IAD can also study the interactions among explicit categories of contagions.Compared to the implicit categories,what is more interesting is the interactions among explicit categories,namely whether con-tagions belonging to one category would have some effects on the spreading of contagions belonging to another category.However,to do this study,we would need to have the category of each contagion.Given the large number of con-tagions,it would be impossible to ask human to annotate all of them.How to find an efficient way to classify contagions with only minimum supervision is thus a challenge in this work.First,we apply a mixture of Gaussians model to explain the generation process of user network features,and use EM algorithm to extract social role distribution for each user.Then we propose a classifica-tion approach for contagions based on co-training,which uses a small number of labeled data and a large number of unlabeled data.After that,we achieve the category of each contagion and the social roles of each user.The proposed model statistically lerns the interactions,and the resulting data assists to bet-ter comprehend the information diffusion process and provide a more accurate prediction for contagion adoption.Experiments with large-scale Weibo dataset demonstrate that IAD outperforms the state-of-art baselines in terms of F1-score and accuracy,as well as the runtime for learning.In addition,the interactions obtained through learning reveal interesting findings,e.g.,food-related conta-gions have the strongest capability to suppress other contagions' propagation,while advertisement-related contagions have the weakest capability.Heterogeneous network embedding is the key idea of the third work HUCE.To study the different kinds of interactions among users and contagions,their representations should be learned simultaneously into the same latent space.HUCE introduces the representation learning techniques,jointly learns the rep-resentations of users and contagions based on a heterogeneous network con-sisting of two types of objects,users and contagions.Specifically,we first con-struct the heterogeneous network by the neighborhood relations,the historical contagion adoption behaviors of users,and the contents of contagions.We then propose HWalk,a novel random walk algorithm designed for the heterogeneous network,where meta-path-based proximity is used to guide the transitions be-tween neighboring nodes.With HWalk,a set of random walk paths is gener-ated with semantic meanings.Then we process these paths to encode different types of nodes in a low-dimensional continuous vector space.Finally,the la-tent representations of users and contagions are exploited as input features and an off-the-shelf classification algorithm can be applied to our prediction task.The HUCE model assists to better comprehend the information diffusion pro-cess and provides a new insight into the decision making of contagion adoption.Moreover,the obtained representations in our proposal can also be applied for other applications,e.g.,the multi-class classification for contagions.The eval-uation results on a large-scale Weibo dataset demonstrate our proposal can out-perform the competing baselines.Moreover,the latent representations are also suitable for multi-class classification of contagions.The research of the problem of revenue maximization in social networks is the fourth work in this paper.In recent years,social networks have become an important platform for information sharing and marketing,and many com-panies promote products and services through word-of-mouth effects in social networks.A user's valuation will be positively influenced by other users who have already purchased the commodity in networks.If a user's neighbor pur-chases the commodity,his valuation changes and he is more likely to purchase the commodity.Revenue maximization is the problem of devising the market-ing strategy to obtain the optimal revenue in social networks,by determining the price of a commodity and identifying a small set of influential vertices to give out commodities for free.As the scales of social networks become larger and larger,which leads to low efficiency and long running time in solving the problem of revenue maximization.As various real-world applications become more and more strict on the execution time of algorithms,it is necessary to study the efficient processing technology for revenue maximization.However,the current existing solutions are not efficient.In this paper,we propose efficient and effective algorithms for both the scenarios of the unlimited supply and the constrained supply.Firstly,a fast method of approximate revenue calculation using local acyclic graphs is proposed to replace the time-consuming simula-tions on influence cascades.The intuition is to restrict the influence spreading to a vertex within its local acyclic graph,and the activation probability of this vertex can be calculated approximately in its local acyclic graph instead of the original large network.Based on this method,for the scenario of unlimited supply,we propose an efficient algorithm,i.e.,the Calculation-based Random-ized(CR)algorithm,in which the revenues are approximately calculated based on local acyclic graphs.For the scenario of constrained supply,we develop an efficient algorithm called the Calculation-based max-Heap updating Greedy(CHG)algorithm.The main idea of this algorithm contains two crucial parts:(1)fast approximate revenue calculation based on the local acyclic graphs;(2)max-Heap updating scheme,which relies on the submodularity of the objective function of the revenue maximization problem.Experiments on both the syn-thetic and real-world datasets demonstrate the efficiency and effectiveness of our proposals,that is,our algorithms run in orders of magnitude faster than the state-of-art baselines,and meanwhile,the maximum revenue achieved is nearly not affected.
Keywords/Search Tags:social networks, information diffusion, analysis and mining
PDF Full Text Request
Related items