
Model And Detection On Microblogging Spammer Behavior Based On Microblogging Data

Posted on: 2015-03-08  Degree: Master  Type: Thesis
Country: China  Candidate: G C Li  Full Text: PDF
GTID: 2298330467963855  Subject: Computer Science and Technology
Abstract/Summary:
In the last two years, Chinese OSNs (Online Social Networks), especially microblogs, have developed rapidly. For example, Sina Weibo has more than 100,000,000 active users and even more registered users. As the microblog platform has grown, a large number of spammers have appeared on it. This paper focuses on the spammer detection problem in Sina Weibo. Based on the results of data analysis on real-world spammer data, this paper proposes a spammer detection model, SDM, based on duplicate microblog post behavior and the LDA topic model. We conduct a series of experiments using SDM to detect spammers. The experiments show the effectiveness of the model and algorithm. The main work is as follows:
1. We designed and implemented a parallel microblogging crawler, and analyzed a large amount of actual spammer data crawled from the Weibo Report hall. We show the different characteristics of different Weibo spammers, which serve as the foundation of the spammer detection model.
2. Based on the behavioral characteristics of spam users (duplicate microblog post behavior), a spammer detection model, SDM, is proposed. SDM mainly considers two aspects, user behavior information and Weibo content information, and provides the behavior evaluation function F(U).
3. The paper carries out a series of experiments based on SDM. The experimental results show the model's effectiveness, the effects of its parameters, and the effects of different kinds of Weibo information.
4. The effect of spammer detection based on SDM is greatly influenced by its parameters, and SDM does not adapt well to other types of spammers. To address this problem, we use F(U) as one feature of classification algorithms such as SVM. Experiments show that a more comprehensive feature set leads to better detection results.
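The abstract does not give the definition of F(U), so the following is only a minimal sketch of what a duplicate-post behavior score might look like, not the thesis's actual function. The names `normalize`, `duplicate_ratio`, and `is_suspected_spammer`, the normalization rules, and the 0.5 threshold are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Sequence


def normalize(post: str) -> str:
    """Strip URLs, @mentions, and whitespace so near-identical posts compare equal.

    These normalization rules are an assumption, not taken from the thesis.
    """
    post = re.sub(r"https?://\S+", "", post)
    post = re.sub(r"@\S+", "", post)
    return re.sub(r"\s+", "", post).lower()


def duplicate_ratio(posts: Sequence[str]) -> float:
    """Fraction of a user's posts that repeat another post by the same user.

    A hypothetical stand-in for a behavior score such as F(U).
    """
    normalized = [normalize(p) for p in posts]
    normalized = [p for p in normalized if p]  # drop posts that became empty
    if not normalized:
        return 0.0
    counts = Counter(normalized)
    # Every copy beyond the first occurrence counts as a duplicate.
    duplicates = sum(c - 1 for c in counts.values())
    return duplicates / len(normalized)


def is_suspected_spammer(posts: Sequence[str], threshold: float = 0.5) -> bool:
    """Flag a user whose duplicate ratio reaches the (assumed) threshold."""
    return duplicate_ratio(posts) >= threshold
```

A score of this kind depends heavily on the chosen threshold, which matches the parameter sensitivity noted in item 4. Rather than thresholding it directly, the value returned by `duplicate_ratio` could be appended to a larger feature vector and passed to a classifier such as an SVM.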
Keywords/Search Tags:Weibo Spammer, Duplicate microblog post, Topic model, Machine-Learning