Technologies Research On Crowd Analysis In Online Social Networks

Posted on: 2016-09-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L M Zhang
GTID: 1108330509961080
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Web 2.0 and the Internet, online social networks have become a main source for users to obtain information, discuss public events, and express their opinions. Online social media such as Twitter and Facebook have transformed people's real-world relationships into online ones, resulting in the formation of a cyber society. Users are the core and basis of online social networks. For example, "follow" relationships construct the topological structure of the social network; tweets and other content are the main information in Twitter; and interactions between users, such as retweets and replies, speed up the diffusion of information. Crowds, formed by users, are the main way for individuals to gain respect and show their value. Due to the rapid development of online social networks, individuals can interact with each other to form crowds quickly and become a new force capable of promoting revolution in a country. Therefore, crowd analysis in online social networks plays an important role in public security.

A crowd is a set of users who share similar interests and interact with each other in online social networks. In this definition, interests and interactions are the two basic features that lead individuals to join crowds. Sentiment, behavior, and content are the three factors of crowds. In this thesis, we systematically introduce key technologies for crowd formation, sentiment evolution, user behavior analysis, and bursty topic detection.

The main contributions of this thesis are as follows.

First, we propose a framework to detect dynamic interaction communities in online social networks. Here, a dynamic interaction community, defined as a set of users in a time interval who are connected by interactions based on documents that share similar topics, is essentially the same as a crowd. Our central idea is to organize interactions into interaction trees according to their content and propagation.
Then, interaction trees are clustered into communities. We monitor the evolution of communities over time by maintaining a list of active communities at each time interval. Specifically, we first develop a Bayesian generative model, called Latent-Topic-Interaction (LTI), to model the interaction trees and their topics, and group interaction trees into intra-interval communities by clustering. At each new time interval, both topic similarity and structural relationships are used to determine whether a community expands. An extensive empirical study on real Sina Weibo data, containing more than 14 million microblog posts, reposts, and comments from the 2012 London Olympics, clearly demonstrates that our method is capable of tracking interesting evolving communities in a large dynamic social network.

Second, we propose a framework for analyzing sentiment evolution in social networks. People's attitudes towards public events or products may change over time rather than remaining in the same state. Given a certain public event or product, a user's sentiments expressed in the data stream can be regarded as a vector. We first present a multidimensional sentiment model with a hierarchical structure to model users' complicated sentiments. Based on this model, we use the FP-growth algorithm to mine frequent sentiment patterns and perform sentiment evolution analysis through temporal analysis based on Kullback-Leibler divergence. Experimental evaluations on real data sets show that the proposed method analyzes sentiment evolution effectively.

Third, regarding user behavior, reblogging, also known as retweeting in Twitter parlance, is a major type of activity in many online social networks.
Although there are many studies on reblogging behavior and its potential applications, whether neighbors who are well connected with each other (called "buddies" in our study) make a difference in reblog likelihood has not been examined systematically. In this thesis, we tackle the problem by conducting a systematic statistical study on a large Sina Weibo data set, which is a sample of 135,859 users, 10,129,028 followers, and 2,296,290,930 reblog messages in total. To the best of our knowledge, this data set has more reblog messages than any data set reported in the literature. We examine a series of hypotheses about how essential neighborhood structures may help to boost the likelihood of reblogging, including buddy neighbors versus buddyless neighbors, traffic between buddy neighbors, activeness (i.e., the total number of blog messages a user sends), and the number of buddy triangles a user participates in. Our empirical study discloses several interesting phenomena that have not been reported in the literature, which may suggest interesting and valuable new applications.

Fourth, regarding public events, how to detect online public events in massive data streams effectively and efficiently has become a hot research area. We propose a novel approach to mining online events based on emoticons. Emoticons in text streams tend to burst along with hot events, so we can monitor the states of emoticons and quickly mine their bursty periods in order to detect events. First, we build an emoticon model based on frequent pattern mining and mutual information, and detect bursty periods using Kleinberg's method. Then, we use Heuristic Affinity Propagation (HAP) to cluster and aggregate events. In addition, a recycle module is proposed in the last part of the framework to make event abstraction more precise.
Experimental results show that our algorithm can detect online events in microblog streams effectively and meets the requirements of real-time processing in both speed and accuracy.

In summary, this thesis presents technical solutions to several essential issues concerning crowds in online social networks. The experiments demonstrate that our methods achieve their goals. This work is significant for both theoretical research and practical applications of crowd analysis in online social networks.
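
As a rough illustration of the sentiment evolution analysis mentioned in the second contribution, the sketch below measures how much a sentiment distribution shifts between consecutive time windows using Kullback-Leibler divergence. It is not the thesis's implementation: the function names, the three-category sentiment counts, and the smoothing constant are all assumptions made for this example, and the multidimensional hierarchical sentiment model and FP-growth pattern mining are not reproduced here.

```python
import math


def kl_divergence(p_counts, q_counts, smoothing=1e-9):
    """Return KL(P || Q) for two count vectors over the same sentiment categories.

    A small additive smoothing term (an assumption of this sketch) avoids
    division by zero when a category is absent from one window.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("count vectors must have the same length")
    k = len(p_counts)
    p_total = sum(p_counts) + smoothing * k
    q_total = sum(q_counts) + smoothing * k
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + smoothing) / p_total
        q = (qc + smoothing) / q_total
        kl += p * math.log(p / q)
    return kl


def sentiment_drift(windows):
    """Compare each time window with the one before it.

    Returns a list of KL(current || previous) values, one per pair of
    consecutive windows; larger values indicate a larger sentiment shift.
    """
    return [kl_divergence(cur, prev) for prev, cur in zip(windows, windows[1:])]


# Hypothetical counts per window for (positive, neutral, negative) posts.
windows = [
    [50, 30, 20],
    [50, 30, 20],
    [10, 30, 60],
]
drift = sentiment_drift(windows)
print(drift)
```

In this toy data the first two windows are identical, so their divergence is zero, while the third window shifts strongly toward negative sentiment and yields a clearly positive value. A threshold on such values is one simple way to flag a change in collective sentiment.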
Keywords/Search Tags: Social networks, crowd, sentiment analysis, user behavior, topic