Information Retrieval And Propagation Analysis In Social Media

Posted on: 2014-06-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z C Luo
Full Text: PDF
GTID: 1108330479979585
Subject: Computer Science and Technology
Abstract/Summary:
Social media is now ubiquitous on the internet, generating both new possibilities and new challenges in information retrieval and propagation analysis. This thesis focuses on finding important information and analyzing information propagation in social media. We take Twitter as our research subject, since it is one of the most popular social media platforms and is public by default, which makes the data less problematic from a privacy standpoint, far easier to obtain, and more amenable to target applications (such as information retrieval).

The main task in information retrieval is finding relevant objective or subjective documents about a topic in a collection. Twitter is a rich resource that contains information about a wide range of topics and opinions, and here we investigate how to find this information. However, Twitter retrieval differs from traditional retrieval tasks (e.g., web search), since the text of a tweet is short and informal. In this study we exploit both the textual features of tweets and social media features to improve Twitter retrieval. Additionally, information dissemination is a prevalent phenomenon in Twitter and is related to message quality, which can help in finding high-quality information. Therefore, from the point of view of both tweets and users, we study the factors that affect whether a tweet is retweeted and how users retweet.

Our work can be divided into four parts: (1) improving Twitter retrieval by exploiting structural information, (2) opinion retrieval in Twitter, (3) finding propagated opinions in Twitter, and (4) finding retweeters in Twitter. The four parts are described below.

Improving Twitter retrieval by exploiting structural information. Twitter retrieval deals with finding tweets relevant to a topic. Most Twitter search systems treat a tweet as plain text when modeling relevance. However, a series of conventions allows users to tweet in structured ways using combinations of different blocks of text. These blocks include plain text, hashtags, links, mentions, etc. Each block encodes a particular communicative intent, and the sequence of blocks captures changing discourse. Previous work shows that exploiting structural information can improve retrieval of structured documents (e.g., web pages). In this study we use the structure of tweets, induced by these blocks, for Twitter retrieval. A set of features, derived from the blocks of text and their combinations, is used in a learning-to-rank setting. We show that structuring tweets can achieve state-of-the-art performance. Our approach does not rely on social media features, but when we do add this additional information, performance improves significantly.

Opinion retrieval in Twitter. Opinion retrieval deals with finding relevant documents that express either a negative or a positive opinion about a topic. Social networks such as Twitter, where people routinely post opinions about almost any topic, are rich environments for opinions. However, spam and wildly varying documents make opinion retrieval within Twitter challenging. Here we demonstrate how to exploit the social and structural textual information of tweets to improve Twitter-based opinion retrieval. In particular, within a learning-to-rank framework, we explore whether aspects of an author (such as the number of friends they have), information derived from the body of tweets, and opinionatedness ratings of tweets can improve performance. Experimental results show that social features can improve retrieval performance. Retrieval using a novel unsupervised opinionatedness feature achieves performance comparable to a supervised method that uses manually tagged tweets. Topic-related specific structured tweet sets are shown to help with query-dependent opinion retrieval. Finally, we further verify the effectiveness of our approach for opinion retrieval on a re-tagged TREC Tweets2011 corpus.

Finding propagated opinions in Twitter. Twitter has become an important source from which people collect opinions to make decisions. However, the amount and variety of opinions constitute the major challenge to using them effectively. Here we consider the problem of finding propagated opinions: tweets that express an opinion about a topic and will also be retweeted. Within a learning-to-rank framework, we explore a wide spectrum of features, such as the retweetability, opinionatedness, and textual quality of a tweet. The experimental results show the effectiveness of our features for this task. Moreover, the best ranking model with all features outperforms both a BM25 baseline and a state-of-the-art Twitter opinion retrieval approach. Finally, we show that our approach equals human performance on this task.

Finding retweeters in Twitter. An important aspect of communication in Twitter (and other social networks) is message propagation, in which people create posts for others to share. Although there has been work on modelling how tweets are propagated (retweeted), the question of who will retweet a message has not been tackled. Here we consider the task of finding who will retweet a message posted on Twitter. Within a learning-to-rank framework, we explore a wide range of features, such as retweet history, follower status, follower active time, and follower interests. We find that followers who have frequently retweeted or mentioned the author's tweets before, and who share common interests with the author, are more likely to be retweeters.

Based on the four studies above, we find that the textual information of tweets and the social media features of Twitter can both help Twitter retrieval and propagation analysis.
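
As an illustration of the block structure described in the first part (plain text, hashtags, links, mentions), the following is a minimal Python sketch of how a tweet might be segmented into blocks and turned into simple features for a ranker. It is not the thesis's implementation: the regular expressions, the function names segment_tweet and block_features, and the feature names are all assumptions made for this example, and the abstract does not specify the actual tokenization rules or feature set.

```python
import re
from collections import Counter

# Hypothetical block patterns; the thesis does not state its segmentation rules.
# Links are listed first so that a '#' or '@' inside a URL stays part of the link.
BLOCK_PATTERN = re.compile(
    r"(?P<link>https?://\S+)"
    r"|(?P<mention>@\w+)"
    r"|(?P<hashtag>#\w+)"
)


def segment_tweet(text):
    """Split a tweet into (block_type, block_text) pairs in reading order."""
    blocks = []
    pos = 0
    for match in BLOCK_PATTERN.finditer(text):
        # Anything between two special blocks is treated as a plain-text block.
        plain = text[pos:match.start()].strip()
        if plain:
            blocks.append(("text", plain))
        # lastgroup is the name of the alternative that matched.
        blocks.append((match.lastgroup, match.group()))
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        blocks.append(("text", tail))
    return blocks


def block_features(text):
    """Derive illustrative structural features: block counts and block order."""
    blocks = segment_tweet(text)
    counts = Counter(kind for kind, _ in blocks)
    return {
        "n_text": counts["text"],
        "n_hashtag": counts["hashtag"],
        "n_mention": counts["mention"],
        "n_link": counts["link"],
        # The sequence of block types stands in for the "changing discourse" signal.
        "block_sequence": "-".join(kind for kind, _ in blocks),
    }


if __name__ == "__main__":
    example = "RT @alice: great talk on IR #sigir http://example.com/x"
    print(segment_tweet(example))
    print(block_features(example))
```

In a learning-to-rank setting, a dictionary like the one returned by block_features would be only one part of the feature vector for a query-tweet pair; relevance scores and any social media features would be added separately.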
Keywords/Search Tags: Twitter, Information Retrieval, Opinion Retrieval, Propagated Opinion, Retweeter