Understanding the spatio-temporal characteristics of twitter data with geo-tagged and non geo-tagged content: Two case studies with the topic of flu and Ted (movie

Posted on: 2017-03-16
Degree: M.S
Type: Thesis
University: San Diego State University
Candidate: Issa, Elias
Full Text: PDF
GTID: 2468390011989956
Subject: Geographic information science and geodesy
Abstract/Summary:
The dynamic characteristics of Twitter messages offer GIS scientists a valuable research opportunity to analyze the diffusion of events such as disease outbreaks, environmental changes, and social movements. This can be achieved by digitally collecting tweets that are geo-tagged with GPS coordinates. However, the percentage of geo-tagged tweets is extremely small compared with non geo-tagged tweets. On the other hand, non geo-tagged tweets often contain noise, such as messages posted by automated robots, that affects the analysis of tweet content. For this reason, it is essential to analyze and understand the differences between geo-tagged and non geo-tagged tweets.

The first part of this research compares the context and temporal trends of geo-tagged and non geo-tagged tweets using two keywords: "flu" and the movie title "Ted". Time series analysis was applied to study the internal structure of geo-tagged and non geo-tagged tweet data on the "flu" and "Ted" topics. Tweets were collected in four targeted cities (San Diego, Los Angeles, Denver, and New York) to represent different geographical areas in the United States.

The second part of this research used a methodological framework to filter out noise by removing retweets and tweets containing URLs from the non geo-tagged tweets. It also analyzed the spatio-temporal distribution of geo-tagged tweets across different types of land use to understand dynamic human activities in urban environments.

The third part of this research adopts social network analysis tools to understand the interaction between Twitter users and to identify the most influential users, or online celebrities, based on their social network connectivity, degree rankings, and dynamic graphs.

This research attempted to find the optimal method for filtering non geo-tagged tweets by removing retweets and tweets containing URLs. In general, geo-tagged tweets demonstrate less noise and a higher correlation to the event than non geo-tagged tweets. With variation across topics, results showed that filtered non geo-tagged tweets performed better when compared with geo-tagged tweets in temporal trend, time series analysis, and content analysis.

This study revealed that keyword choice in Twitter is essential to how strongly geo-tagged tweets correlate with non geo-tagged tweets, and to how geo-tagged tweets are distributed spatially across various land uses. Lastly, this study showed how communities and social ties form in social network graphs, and how events such as disease outbreaks and entertainment can affect communication between groups within certain time intervals.
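The filtering step described in the abstract (removing retweets and tweets containing URLs) could look roughly like the following. This is a minimal sketch, not the thesis's actual implementation: the dictionary layout with a `text` field, the "RT @user" convention used to detect retweets, and the names `is_noise` and `filter_tweets` are all assumptions made for illustration.

```python
import re

# Assumption: a URL is anything starting with http://, https://, or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)
# Assumption: retweets are identified by the conventional "RT @user" prefix.
RETWEET_PATTERN = re.compile(r"^\s*RT\s+@\w+", re.IGNORECASE)


def is_noise(text):
    """Return True if the tweet text looks like a retweet or contains a URL."""
    return bool(RETWEET_PATTERN.search(text) or URL_PATTERN.search(text))


def filter_tweets(tweets):
    """Keep only tweets that are neither retweets nor contain URLs.

    `tweets` is a list of dicts with a hypothetical "text" field.
    """
    return [tweet for tweet in tweets if not is_noise(tweet["text"])]


# Made-up sample messages, for illustration only.
sample = [
    {"text": "RT @someone: flu season is here"},
    {"text": "Flu shots available http://example.com"},
    {"text": "Stuck at home with the flu today"},
]
kept = filter_tweets(sample)
```

With this made-up sample, only the third message survives the filter, since the first is a retweet and the second contains a URL.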
Keywords/Search Tags:Non geo-tagged, Twitter, Data, Understand, Content, Flu, Ted