Research On Influence Measurement And Popularity Prediction In Online Social Networks

Posted on: 2016-02-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Gao
Full Text: PDF
GTID: 1108330461985524
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology and the arrival of the Web 2.0 era, social media of many kinds continue to emerge on the Internet and have profoundly changed how people access information and communicate with each other. As a new spreading medium, social media have markedly weakened the dominant position of traditional media in the information spreading process and brought about a new spreading pattern in which users can be both the source and the audience of information. The emergence and development of social media have greatly reduced the cost of spreading information in online social networks, giving online information spreading some typical characteristics, such as fast spreading speed, wide coverage and deep social influence. However, recent studies have shown that the popularity of information in online social networks follows a power-law distribution, which implies that only a few pieces of information go extremely viral, while the vast majority never become popular. This phenomenon raises the following questions: which pieces of information will become popular and diffuse the most, that is, whether the popularity of a piece of information can be predicted from the early stage of its spreading process; among all the factors involved in the information spreading process, which are the key factors that affect popularity; and how to measure the spreading ability of users in the network and thereby identify the users who play important roles in the spreading process.
The above issues relate to research on influence measurement and popularity prediction in online social networks, which is currently a hot research topic at home and abroad and is the main concern of this thesis. Supported by NSF, this thesis investigates two key research issues of the information spreading process in online social networks: influence measurement and popularity prediction. The main research contents and innovations of this thesis are as follows:

1. We propose a novel measure of node influence based on local network structure. To measure the influence of nodes in large-scale online social networks effectively and efficiently, we propose a local structural centrality measure. Unlike existing local metrics, the proposed measure considers both the topological structure of the local network around a node and the influence feedback from its nearest neighbors. Specifically, the local network around a node consists of the node itself together with its nearest and next-nearest neighbors. The topological structure information of the local network includes the number of nodes in it and the topological connections among these nodes. Influence feedback from the nearest neighbors means that the influence of a node is the sum of the relative influence of its nearest neighbors. To evaluate the effectiveness and robustness of the proposed measure, we conducted extensive experiments on real networks of various sizes and on artificial networks with various sizes, degree distributions and community structures. The experimental results show that the proposed measure captures the influence of nodes better than other centrality measures such as degree, k-shell, betweenness, closeness and local centrality, and that its performance is robust across different types of networks.
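The idea of scoring a node from its two-hop local network can be illustrated with a minimal sketch. This is not the thesis's actual formula, which the abstract does not give. The function names, the weight `alpha`, and the way node counts and edge counts are combined are all assumptions made for illustration only.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

def local_network(adj, v):
    """Node v together with its nearest and next-nearest neighbors."""
    nodes = {v} | adj[v]
    for u in adj[v]:
        nodes |= adj[u]
    return nodes

def local_strength(adj, v, alpha=0.5):
    """Hypothetical per-node term: mixes the number of nodes in v's local
    network (v excluded) with the number of edges among them.
    The mixing weight alpha is an assumption, not taken from the thesis."""
    nodes = local_network(adj, v)
    n_nodes = len(nodes) - 1
    # Each edge inside the local network is seen from both endpoints.
    n_edges = sum(len(adj[u] & nodes) for u in nodes) // 2
    return alpha * n_nodes + (1 - alpha) * n_edges

def local_score(adj, v, alpha=0.5):
    """Influence feedback: sum the per-node term over v's nearest neighbors."""
    return sum(local_strength(adj, u, alpha) for u in adj[v])
```

Because each score depends only on a node's two-hop neighborhood, it can be computed node by node without any global traversal, which is what makes local measures of this kind attractive for large networks.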
Further, the experimental results show that the proposed measure better distinguishes the influence of nodes and better ranks the most influential nodes. Since the proposed measure considers only the local network around a node, it has much lower time complexity and can be applied directly to large-scale networks.

2. We investigate effective features for the popularity prediction problem on microblogging platforms. By formulating popularity prediction as a classification problem, we investigate which features are effective for classification. Specifically, we consider two prediction tasks: predicting the popularity of a message from observations in the first hour after it is posted (PP1H), or from observations of its first k retweets (PPkR). We formulate the former as a multi-class classification task that predicts the popularity range of a message, and the latter as a binary classification task that predicts whether or not a message will become popular. We then apply five standard classifiers (naive Bayes, k-nearest neighbor, support vector machine, logistic regression and bagging decision trees). To identify effective features, we investigate a wide spectrum of features, including retweet network features and border network features extracted from the underlying user network, and temporal features derived from the observed retweets. Further, to eliminate the impact of variation in user activity on the classification task, we introduce the notion of weibo time and use it to measure the temporal features. Experiments on a Sina Weibo dataset show that, for the PP1H task, bagging decision trees with all features yield the best classification performance, and border network features are more effective than the other two groups of features; for the PPkR task, satisfactory classification performance can be obtained using only the temporal features of the first 10 retweets.
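As a rough illustration of what temporal features of the first k retweets might look like for the PPkR task, consider the sketch below. The abstract does not list the actual features, so the feature names here are hypothetical. The optional `time_map` argument is likewise an assumed stand-in for converting wall-clock time into an activity-adjusted time such as weibo time, whose definition is not given in the abstract.

```python
def temporal_features(post_time, retweet_times, k=10, time_map=None):
    """Compute simple temporal features from the first k retweets.

    post_time and retweet_times are timestamps in seconds.
    time_map is an optional monotone function that converts wall-clock
    time into an adjusted time scale (hypothetical hook, identity by default).
    """
    if time_map is None:
        time_map = lambda t: t
    first_k = sorted(retweet_times)[:k]
    if len(first_k) < k:
        raise ValueError("fewer than k retweets observed")
    origin = time_map(post_time)
    # Elapsed (possibly remapped) time from posting to each retweet.
    elapsed = [time_map(t) - origin for t in first_k]
    # Gaps between consecutive events, counting the post as event zero.
    gaps = [b - a for a, b in zip([0.0] + elapsed[:-1], elapsed)]
    return {
        "time_to_first": elapsed[0],
        "time_to_kth": elapsed[-1],
        "mean_gap": sum(gaps) / len(gaps),
        "max_gap": max(gaps),
    }
```

A feature dictionary like this would be computed per message and passed to any of the standard classifiers named above, with a binary label indicating whether the message eventually became popular.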
Further, the experimental results show that introducing weibo time significantly improves the classification performance of temporal features.

3. We present a popularity prediction model for weibo messages based on a reinforced Poisson process. The model approaches popularity prediction by modeling the time series of a message's retweeting dynamics. Specifically, based on a reinforced Poisson process, the model characterizes the process through which a message gains popularity using three key ingredients: (1) the fitness of the message, describing its inherent competitiveness against other messages in attracting users' attention; (2) a power-law temporal relaxation function, corresponding to the aging of its novelty; (3) an exponential reinforcement function, capturing the preferential attachment phenomenon in its retweeting dynamics. Further, we again use the notion of weibo time and integrate a time mapping process into the model to eliminate the impact of variation in user activity. Extensive experiments on Sina Weibo datasets show that the model fits the retweeting dynamics of weibo messages well and outperforms other prediction models in predicting message popularity. Further, integrating the time mapping process significantly improves the prediction performance of the model.
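The three ingredients of the model can be sketched as a retweet rate function, assuming they combine multiplicatively as is common for reinforced Poisson processes. The exact functional forms and parameters are not stated in the abstract, so the offset `t0`, the exponents `gamma` and `beta`, and their default values are illustrative assumptions only.

```python
import math

def relaxation(t, gamma=1.5, t0=1.0):
    """Power-law decay of novelty with elapsed time t.
    The offset t0 (assumed) avoids a singularity at t = 0."""
    return (t + t0) ** (-gamma)

def reinforcement(m, beta=0.01):
    """Exponential function of the m retweets received so far.
    The form exp(beta * m) and the value of beta are assumptions."""
    return math.exp(beta * m)

def retweet_rate(t, m, fitness, gamma=1.5, beta=0.01):
    """Instantaneous retweet rate: fitness x relaxation x reinforcement."""
    return fitness * relaxation(t, gamma) * reinforcement(m, beta)
```

In this sketch the rate falls as the message ages and rises with the retweets already received, so early retweets can sustain later attention, while fitness scales the whole curve for each message.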
Keywords/Search Tags: Online Social Network, Information Spreading, Influence Measurement, Popularity Prediction, Reinforced Poisson Process