Sampling Social Media Data For Hot Spot Sensing And Analytics

Posted on: 2017-01-29 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Li | Full Text: PDF
GTID: 2308330485968989 | Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, social media has become the first choice for discussing trending social events. Users generate large volumes of behavioral data every day, and these data play an essential role in understanding hot spots, information dissemination, and user actions. Social media data is an important means of sensing social events. It is characterized by large volume, rapid updates, and strong timeliness. Acquiring social media data is difficult and costly, so an effective acquisition method can greatly improve the applications built on it.

Aiming at sensing hot events, we design a scalable framework for social media data acquisition and analysis, and we study an adaptive method that samples as much data as possible under limited resources. The main contributions of this thesis are as follows:

1. A framework for social media data acquisition and analysis is designed. The framework comprises a social media data sampling method, a social hot spot sensing method, and a social user cluster behavior analysis method. Together, the three methods cover the whole process from acquisition to analysis of social media data. The framework samples social media data dynamically and analyzes the sensed hot spots in real time.

2. An adaptive sampling strategy for social media data is presented. Social data sampling is subject to many limitations. In this thesis, the limited resources are allocated among three representative social streams, and the method analyzes the characteristics of social media to adjust the sampling strategy. Compared with other methods, the adaptive method obtains more data when sensing hot spots.

3. The sampling method is validated in terms of the sampling rate and the integrity of the hot spot data, and is tested from three aspects: validity, timeliness, and integrity. Based on the sampling method, an application system for online collective behavior analytics is constructed.
The experimental data come from a data set previously collected by our group, consisting of tweets from 1.7 million users. The experiments show that the adaptive data sampling strategy obtains hot spot data more effectively. In addition, the application system built on the sampling method can visualize events in real time to sense public sentiment, understand how events evolve, and analyze the behavior of event participants.

This thesis studies the problems of social media data sampling, social hot spot sensing, and social user cluster behavior analysis, and presents an adaptive social media data sampling method. The online cluster behavior analytics system has been running since June 2015. The system can help researchers sense hot events in real time, understand the development of events, and analyze the behavioral characteristics of event participants. It is of great value for analyzing public opinion and for social science research.
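The abstract does not spell out how the limited resources are divided among the three streams. As a rough illustration only, one plausible reading is a proportional allocation: each stream receives a guaranteed minimum number of requests, and the rest of the budget follows how many hot spot items each stream yielded in the most recent window. The stream names, the `allocate_budget` function, and the `min_per_stream` parameter below are all hypothetical and are not taken from the thesis.

```python
from typing import Dict


def allocate_budget(recent_hits: Dict[str, int],
                    total_budget: int,
                    min_per_stream: int = 10) -> Dict[str, int]:
    """Split a fixed request budget across streams (illustrative sketch).

    recent_hits: hot spot items each stream yielded in the last window.
    total_budget: number of requests available for the next window.
    min_per_stream: guaranteed floor, so no stream is starved entirely.
    """
    streams = list(recent_hits)
    remaining = total_budget - min_per_stream * len(streams)
    if remaining < 0:
        raise ValueError("budget too small for the per-stream minimum")

    total_hits = sum(recent_hits.values())
    allocation = {}
    for name in streams:
        if total_hits > 0:
            # Integer arithmetic keeps the split deterministic.
            extra = remaining * recent_hits[name] // total_hits
        else:
            # No signal yet: fall back to an equal split.
            extra = remaining // len(streams)
        allocation[name] = min_per_stream + extra

    # Requests lost to integer rounding go to the best-performing stream.
    leftover = total_budget - sum(allocation.values())
    best = max(streams, key=lambda s: recent_hits[s])
    allocation[best] += leftover
    return allocation


# Hypothetical stream names and hit counts, for illustration only.
hits = {"user_stream": 6, "keyword_stream": 3, "public_stream": 1}
plan = allocate_budget(hits, total_budget=100)
print(plan)
```

Re-running the allocation at the start of every window with fresh hit counts is what would make such a scheme adaptive; the floor keeps a quiet stream observable so that a newly emerging hot spot on it can still be detected.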
Keywords/Search Tags: Social Media, Social Streams, Adaptive Sampling, Hot Spots Sensing, Collective Behaviors Analytics