Microblogs Data Acquisition And Dissemination Behaviour Modeling

Posted on:2015-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:H X Ma

Full Text:PDF

GTID:2308330467471469

Subject:Computer software and theory

Abstract/Summary:

With the rapid development of Web 2.0 applications, social media, as a platform for people to record their lives, share information and make friends online, has drawn great attention from business, politics and academia. Social media can help to understand a user's social relationships, online behavior and preferences, and thus contribute to recommendation systems such as friend, product and service recommendation. Furthermore, social media can be used to predict online collective behavior by studying historical information diffusion patterns, which can help maintain a harmonious society. The collection of social media data and the detection of online collective behavior are therefore among the most critical and urgent research topics.

Traditional sampling techniques cannot be applied directly to social media data because of its strong relationship dependency. The task is also challenging because of the volume, velocity and variety of social media. Microblogging services, as one of the most typical social media platforms, share most of the characteristics of social media. This thesis focuses on data collection and the discovery of information diffusion patterns. Our contributions are as follows.

· A structure-based data acquisition method for social media is proposed and implemented. Following Weakly Connected Component (WCC) theory, the distributed crawler starts from selected seed users and then expands through the followee network in breadth-first order. The collected data set is published and used for further discussion.

· A formal definition of the popularity of microblogs is provided. The definition considers both the number of retweets (#retweet) and the number of possible views (#pv). Moreover, from observations we conclude that tweets with a larger #retweet tend to have a larger #pv.

· The life cycle and tipping points of tweets are studied. The results indicate that most tweets with a larger #retweet have a life cycle of less than 48 hours. In addition, a tweet may have a tipping point, which is a burst in the diffusion process. On real data, the retweet volume over time follows a sigmoid function, so a sigmoid function is used to fit the trend. A parameter estimation method for the model is provided, and the experimental results show that the model and the estimation method achieve high precision.

· A resource library for analysing online collective behavior is developed. It can describe an event in terms of time, location, sentiment and diffusion network. We also provide a demo system that visualizes the evolution of an event, including the event participants, people's attitudes, influential users and so on.

This thesis explores the feasibility of data collection from microblogging systems. Based on the collected data and the proposed definition of popularity, we model information diffusion and study the life cycle and tipping point of tweets. Finally, an open resource library for collective behavior analysis is established. The visualization demo system shows the role of social media in studying users' collective behavior from multiple aspects.
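The two technical steps named in the abstract, breadth-first expansion through the followee network and fitting a sigmoid to retweet volume, can be sketched as follows. This is a minimal illustration and not the thesis implementation: `get_followees`, `max_users`, `total`, `rate` and `midpoint` are hypothetical names, the crawler is single-process rather than distributed, the retweet volume is assumed to be a cumulative count, and the logit linearization is only one simple way to estimate sigmoid parameters (the abstract does not say which estimator the thesis uses).

```python
import math
from collections import deque


def bfs_crawl(seeds, get_followees, max_users):
    """Visit users breadth-first, starting from the seed users.

    get_followees is a hypothetical callable returning the accounts
    that a given user follows (for example, a wrapper around an API).
    """
    visited = set()
    queue = deque()
    for seed in seeds:
        if seed not in visited:
            visited.add(seed)
            queue.append(seed)
    order = []
    while queue and len(order) < max_users:
        user = queue.popleft()
        order.append(user)
        for followee in get_followees(user):
            if followee not in visited:
                visited.add(followee)
                queue.append(followee)
    return order


def sigmoid(t, total, rate, midpoint):
    """Assumed cumulative retweet count at time t."""
    return total / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_sigmoid(times, counts, total):
    """Estimate rate and midpoint by least squares on the logit.

    Assumes total (the final retweet count) is known, so that
    log(c / (total - c)) = rate * (t - midpoint) is linear in t.
    """
    xs, ys = [], []
    for t, c in zip(times, counts):
        if 0 < c < total:  # the logit is undefined at 0 and at total
            xs.append(t)
            ys.append(math.log(c / (total - c)))
    if len(xs) < 2:
        raise ValueError("need at least two usable observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    return rate, -intercept / rate


# Toy followee graph and synthetic retweet counts (illustrative data only).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d", "a"], "d": []}
crawl_order = bfs_crawl(["a"], lambda user: graph.get(user, []), max_users=10)

hours = list(range(48))
observed = [sigmoid(t, 1000.0, 0.3, 12.0) for t in hours]
rate, midpoint = fit_sigmoid(hours, observed, total=1000.0)
```

In this sketch the midpoint of the fitted curve is the time of fastest growth, which is one possible way to read the "tipping point" mentioned in the abstract; the thesis may define it differently.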

Keywords/Search Tags:

Social Media, Crawler, Popularity, Collective Behaviors, Modeling

Related items

1	Analyzing And Modeling Of Popularity Of Social Media Content
2	The Study Of Collective User Behaviors On Social Media Under Emergencies
3	Sampling Social Media Data For Hot Spot Sensing And Analytics
4	A Study Of Collective Mourning Behavior In Social Media
5	Research On The Method Of Improving The Social Popularity Of Video Based On YouTube
6	The Research Of Popularity Prediction Algorithm Of The Message On Social Media
7	Research On Network Collective Action In Chinese Social Media
8	The Application Of Collective Intelligence In Social Media
9	The Construction Of Collective Memory In The Network Era By Integrated Media
10	Research On Social Media Users' Post Adoptive Switching Behaviors