
Uncovering Collaborative Users On Social Media

Posted on: 2016-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Zhang
Full Text: PDF
GTID: 2308330461475759
Subject: Software engineering
Abstract/Summary:
The growing importance of social media attracts more and more users and companies to conduct marketing activities on it. In these activities, social spammers collaboratively produce large volumes of similar spam messages to improve their visibility, which affects other users. Discovering collaborative behavior plays an important role in spam detection, public opinion analysis, SNS marketing, and related tasks. However, social media is characterized by enormous data volumes, rapidly updated messages, and a relatively low density of collaborative behavior, so identifying collaborative behavior effectively and correctly is a significant research problem.

This thesis focuses on the problem of collaborative behavior discovery and collaborative user detection. The major contributions are as follows:

1. Collaborative users are classified according to their behavior patterns in marketing and promotion activities. Four typical kinds of collaborative users are defined, along with descriptions of their behavior characteristics, user profiles, and data features. Two detection frameworks are also designed accordingly for further use.

2. An LSH (locality-sensitive hashing) based near-duplicate detection method and its MapReduce implementation are proposed to identify the massive number of duplicate messages in social media. The collaborative users filtered out by this method have distinctly different user profiles, social network structures, and behavior patterns. Empirical studies show its effectiveness in detecting collaborative users.

3. A topic-model-based method is put forward to detect collaborative users and collaborative user groups using their retweeting behavior features. First, each account's retweet profile (RP) is built by mining the relationships between accounts and messages and between accounts. Then, LDA is used for user clustering. Finally, a graph-based semi-supervised learning algorithm with a label propagation procedure is adopted to distinguish spammers from non-spammers with little labeled training data. The method is shown to be effective in discovering promoters and skeleton groups.

4. The data used in this thesis consist of tweets from 2 million users generated over the past 5 years. Among these, about 18,000 collaborative users were collected together with their tweets and behavior records, which provides a solid data resource for subsequent research.

To sum up, this thesis studies the detection of collaborative behavior, collaborative users, and collaborative user groups in social media. Two detection methods are proposed within a unified framework, and empirical studies on a real-life dataset demonstrate their effectiveness.
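
The abstract does not say which LSH family the near-duplicate detection method in contribution 2 uses. As a rough illustration only, the sketch below assumes MinHash signatures over character shingles with banding, which is one common way to find near-duplicate short messages. All function names and parameters (`shingles`, `minhash_signature`, `candidate_pairs`, `num_hashes`, `bands`) are hypothetical and are not taken from the thesis.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of character k-grams of a lowercased, whitespace-normalized message."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token; the seed selects one hash function of the family."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(tokens, num_hashes=32):
    """MinHash signature: the minimum hash value of the token set under each seeded hash."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def candidate_pairs(messages, num_hashes=32, bands=8):
    """Return pairs of message ids whose signatures agree on at least one whole band.

    `messages` maps a message id to its text (an assumed input format).
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for msg_id, text in messages.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            # In a MapReduce setting this (band, slice) tuple could act as the
            # map output key, with the reducer receiving all ids in a bucket.
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(msg_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Exact Jaccard similarity, used to verify candidate pairs."""
    return len(a & b) / len(a | b)
```

Candidate pairs would normally be verified with `jaccard` on the shingle sets before being treated as duplicates, since banding can produce false positives. Accounts that repeatedly appear in verified pairs would then be the natural input to a collaborative-user filter.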
Keywords/Search Tags: Social Media, Collaborative Behavior Discovery, Duplication Detection, Topic Model, Spam Detection