
Research On User Behavior Analysis And Network Evolution In Microblogging Networks

Posted on: 2015-12-26  Degree: Doctor  Type: Dissertation
Country: China  Candidate: W G Yuan
GTID: 1488304310496384  Subject: Information networks and security
Abstract/Summary:
Driven by the rapid development of Web 2.0 and mobile network technology, microblogs have become the most popular form of online social networking. As a form of self-media, microblogging networks let users interact with other individuals anytime and anywhere through a variety of access methods. A new kind of complex network, constituted by these user interactions, has become more flexible and faster. At the same time, user behavior and network structure directly influence the process of information spreading. In view of this, we use interdisciplinary ideas and methods to study user behavior and network evolution in microblogging networks: to find their statistical features, to reveal the underlying mechanisms that govern their evolution, to establish mathematical models that characterize these laws, and to put forward strategies that can predict user behavior. Our work may help in understanding the characteristics of user behavior and the evolution of network structure in microblogging networks, and may also provide some exploratory theoretical results for the study of complex systems. The work of the dissertation is supported by the National Natural Science Foundation of China (No. 61172072, 61271308), the Beijing Natural Science Foundation (No. 4102047, 4112045), and the Fundamental Research Funds for the Central Universities (No. 2011YJS215).
The main contributions of the dissertation are as follows:
1. We study user characteristics and posting behavior, and present a model of user posting behavior in microblogs. First, empirical analysis reveals the statistical features of user characteristics and the relations among them. Second, we find that the distribution of intervals between a user's posts follows a power law at both the individual and the group level, and that user activity is positively correlated with the power-law exponent.
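To make the power-law claim about posting intervals concrete, here is a minimal sketch of how such an exponent could be estimated. It uses the standard maximum-likelihood estimator for a continuous power law and checks it on synthetic data. The function names, the cutoff `x_min`, and the synthetic sample are illustrative assumptions; the abstract does not say which fitting procedure the dissertation used.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Draw n synthetic intervals with density p(x) ~ x^(-alpha) for x >= x_min.

    Uses inverse-transform sampling; this stands in for real posting
    intervals, which are not available here.
    """
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


def estimate_exponent(intervals, x_min):
    """Maximum-likelihood estimate of alpha for a continuous power law.

    alpha_hat = 1 + n / sum(ln(x_i / x_min)), using only x_i >= x_min.
    """
    tail = [x for x in intervals if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one interval strictly above x_min")
    return 1.0 + len(tail) / log_sum


if __name__ == "__main__":
    rng = random.Random(42)
    intervals = sample_power_law(20000, alpha=2.5, x_min=1.0, rng=rng)
    print("estimated exponent:", estimate_exponent(intervals, x_min=1.0))
```

Applied per user and again to the pooled intervals, an estimator of this kind would give the individual-level and group-level exponents that the abstract compares with user activity.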
The user's posting time series is self-similar, and posting behavior is also periodic. Further study shows that the exponent of the interval-time distribution is positively correlated with user interaction, and that a user's interest is also influenced by retweet and comment behaviors. Taking these effects into account, we discuss an improved model based on social-driven and interest-driven effects and on the analysis results. We also propose another model in which user interest changes according to a logistic function. These models reproduce the basic characteristics of the intervals between status releases in microblogging networks.
2. We study user characteristics and the distribution of their growth rates. Based on actual data from Sina Weibo, we study the distributions of three user characteristics: the numbers of followers, friends, and statuses. These follow a double power-law distribution, and different types of users show different features. We find that the double Pareto lognormal (DPLN) distribution fits the overall distribution of the three user characteristics better than the lognormal and power-law distributions. The user activity span is found to be exponentially distributed, and within each activity span the three user characteristics approximately follow a lognormal distribution. Furthermore, the growth rates of these characteristics follow a lognormal distribution and are independent of the characteristics themselves. This phenomenon is consistent with the double Pareto lognormal distribution model and can explain the mechanism that forms the distribution of user characteristics. Moreover, the number of users in each growth pattern can be counted using the K-means clustering algorithm based on vector cosine similarity.
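The clustering step mentioned above can be sketched as K-means on unit-normalized vectors, where each point is assigned to the centroid with the highest cosine similarity. This is only an illustration under stated assumptions: the farthest-first initialization, the function names, and the toy growth series are not taken from the dissertation.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_kmeans(series, k, max_iters=50):
    """Cluster time series by shape using cosine similarity.

    Initialization is farthest-first (an assumption made here so that the
    sketch is deterministic): start from the first series, then repeatedly
    add the series least similar to the centroids chosen so far.
    """
    unit = [normalize(s) for s in series]
    centroids = [unit[0]]
    while len(centroids) < k:
        nxt = min(unit, key=lambda u: max(cosine(u, c) for c in centroids))
        centroids.append(nxt)

    labels = [-1] * len(unit)
    for _ in range(max_iters):
        new_labels = [max(range(k), key=lambda c: cosine(u, centroids[c]))
                      for u in unit]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [u for u, lab in zip(unit, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centroids[c] = normalize(mean)
    return labels


if __name__ == "__main__":
    # Toy per-period follower increments (hypothetical, not Sina Weibo data).
    explosive = [[0, 0, 0, 0, 10], [0, 0, 0, 1, 20], [0, 1, 0, 0, 15]]
    sustained = [[1, 1, 1, 1, 1], [2, 2, 3, 2, 2], [5, 4, 5, 5, 6]]
    labels = cosine_kmeans(explosive + sustained, k=2)
    print(labels)
    # Counting users per growth pattern is then a tally of the labels.
    print({lab: labels.count(lab) for lab in set(labels)})
```

Because cosine similarity ignores vector length, two users whose follower counts grow with the same shape fall into the same cluster even if their absolute numbers differ by orders of magnitude.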
The growth patterns of user characteristics are observed by cluster analysis of the actual time series, which are grouped by different sorting methods and initial scales. Users with higher growth rates are mainly in the explosive growth pattern, and users with higher initial numbers tend to be in the sustainable growth pattern. Based on an analysis of the explosive growth of the number of followers, the relationships between the growth in the numbers of retweets and comments are compared, and reasons for the explosive growth of users are proposed. Finally, another significant finding is that the distribution of the cumulative sums of followers, friends, and statuses follows a strict power-law form, which indicates an allometric growth phenomenon.
3. We study node centrality characteristics and identify the most influential nodes for spreading dynamics. First, two bidirectional user relationship networks are established based on actual data from Sina Weibo. By analyzing the statistical characteristics of the network topology, we find that both have small-world and scale-free characteristics. Moreover, we describe four network centrality indicators: node degree, closeness, betweenness, and K-Core. Through empirical analysis of the distributions of the four centrality metrics, we find that node degrees follow a segmented power-law distribution; betweenness shows the most significant differences; both networks have a significant hierarchy, but not all nodes with higher degree have greater K-Core values; and the centrality indicators are strongly correlated across all nodes, but this correlation weakens for nodes with higher degree. Finally, the two networks are used to simulate information spreading with the SIR information dissemination model, which is based on infectious disease dynamics.
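The SIR experiment described above can be illustrated with a small discrete-time simulation on an adjacency list, started from different seed nodes. This is a sketch under assumptions: the per-step infection probability `beta`, the recovery probability `gamma`, the toy graph `ADJ`, and the function names are all hypothetical, and the dissertation's exact SIR variant and parameters are not given in the abstract.

```python
import random

# Hypothetical toy network: node 0 is a hub, nodes 4-5-6 form a chain.
ADJ = {
    0: [1, 2, 3, 4],
    1: [0],
    2: [0],
    3: [0],
    4: [0, 5],
    5: [4, 6],
    6: [5],
}


def simulate_sir(adj, seed_node, beta, gamma, rng):
    """Run one discrete-time SIR cascade and return the final outbreak size.

    Each step, every infected node infects each susceptible neighbor with
    probability beta, then recovers with probability gamma.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive so the cascade terminates")
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for u in sorted(infected):
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        healed = {u for u in sorted(infected) if rng.random() < gamma}
        recovered |= healed
        infected = (infected - healed) | newly_infected
    return len(recovered)


def mean_outbreak_size(adj, seed_node, beta, gamma, runs=200, seed=0):
    """Average final outbreak size over several independent runs."""
    rng = random.Random(seed)
    total = sum(simulate_sir(adj, seed_node, beta, gamma, rng)
                for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    for start in (0, 6):
        size = mean_outbreak_size(ADJ, start, beta=0.3, gamma=0.5)
        print(f"seed node {start}: mean outbreak size {size:.2f}")
```

Ranking seed nodes by a centrality indicator and comparing the resulting mean outbreak sizes is one way to check which indicator best identifies influential spreaders.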
The simulation results show that the choice of initially selected individuals affects the scope and speed of information dissemination. We find that closeness and K-Core represent the core location in the network more accurately than the other indicators, which helps us identify influential nodes in the information dissemination network.
4. We present an evolution model for microblogging networks based on community structure and a mixed connection mechanism. Based on user profile data collected from microblogs, we find that the number of a user's bidirectional friends approximately follows a lognormal distribution. Furthermore, we build two microblog user networks based on real bidirectional relationships; both are small-world and scale-free and also have some special properties, such as a double power-law degree distribution, disassortativity, and hierarchical and rich-club structure. Moreover, by detecting the community structures of the two real networks, we find that their community sizes follow an exponential distribution. Based on the empirical analysis, we propose a novel network evolution model with mixed connection rules, including lognormal-fitness preferential attachment and random attachment, interconnection of close neighbors within the same community, and global random associations between different communities. The simulation results show that our model is more consistent with real networks. By adjusting the model parameters, we can generate simulated networks with different degree distributions and clustering coefficients.
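As a rough illustration of one ingredient of the model, the sketch below grows a network in which each new node attaches either preferentially, with probability proportional to lognormal fitness times degree, or uniformly at random. It deliberately omits the community-based rules (close-neighbor interconnection within a community and random links between communities). The mixing probability `p_pref`, the seed clique, and all names are assumptions, not the dissertation's actual rules or parameters.

```python
import random


def grow_network(n, m=2, p_pref=0.7, mu=0.0, sigma=1.0, seed=0):
    """Grow an undirected network of n nodes with mixed attachment.

    Each new node adds m edges. For each edge, with probability p_pref the
    target is drawn with weight fitness * degree (fitness is lognormal),
    otherwise the target is drawn uniformly at random.
    Returns (edges, degree, fitness).
    """
    rng = random.Random(seed)
    edges = set()
    degree = [0] * (m + 1)
    fitness = [rng.lognormvariate(mu, sigma) for _ in range(m + 1)]

    # Seed graph (an assumption): a clique on the first m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            edges.add((i, j))
            degree[i] += 1
            degree[j] += 1

    for new in range(m + 1, n):
        existing = list(range(new))
        targets = set()
        while len(targets) < m:
            if rng.random() < p_pref:
                weights = [fitness[v] * degree[v] for v in existing]
                target = rng.choices(existing, weights=weights, k=1)[0]
            else:
                target = rng.choice(existing)
            targets.add(target)  # the set prevents duplicate edges
        fitness.append(rng.lognormvariate(mu, sigma))
        degree.append(0)
        for t in targets:
            edges.add((t, new))
            degree[t] += 1
            degree[new] += 1
    return edges, degree, fitness


if __name__ == "__main__":
    edges, degree, _ = grow_network(2000, m=2, p_pref=0.7, seed=1)
    print("edges:", len(edges), "max degree:", max(degree))
```

Varying `p_pref` and `sigma` in a sketch like this changes how heavy-tailed the degree distribution becomes, which loosely mirrors the abstract's remark that model parameters control the generated degree distributions.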
Keywords/Search Tags: Microblog Networks, User Behavior, Growth Patterns, Complex Networks, Centrality, Evolution Model, Information Dissemination