Research and Implementation of Weibo Spammer Detection Technology

Posted on: 2014-06-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2308330479479295
Subject: Software engineering
Abstract/Summary:
With the opening of microblogging platforms, spammers serving a wide variety of purposes have emerged. They create false popularity, take part in network marketing, or promote events and even rumors, and they have become an obstacle to people enjoying the service. Spammer accounts also pose security risks. The identification of machine-generated spammers is therefore becoming increasingly important. To facilitate spam bot detection, this thesis makes the following contributions:

Firstly, microblog text is difficult to process because it is short. Based on the posting behavior of spammers, we introduce copy detection techniques to the analysis of microblog text and propose a fingerprint-based technique for detecting duplicated text.

Secondly, current studies of spammer detection lack discussion of how effective each feature is for recognition. To address this, we experimentally analyze several commonly used spammer identification features and give an assessment of each.

Thirdly, building on this feature analysis, we divide the discriminating features into two groups: strong features and weak features. We then apply a filtering strategy based on strong features and a voting strategy based on weak features to spammer detection, and we give the threshold strategy for both discrimination mechanisms. We evaluated the algorithm, and the results show that the method achieves relatively high recognition accuracy.

Finally, based on a requirements analysis, we propose and implement a scalable architecture for spammer identification, and we define a service strategy for spammer recognition.
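
The fingerprint-based duplicate check and the two-stage decision (strong-feature filtering followed by weak-feature voting) described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not say which features are strong or weak, how fingerprints are computed, or what the thresholds are. The character k-gram fingerprints, the Jaccard similarity, the choice of duplicate ratio as the strong feature, the three weak features, the account fields, and all threshold values below are assumptions chosen only for illustration.

```python
import hashlib
import re


def fingerprints(text, k=5):
    """Hash every character k-gram of a post (assumed fingerprint scheme)."""
    normalized = re.sub(r"\s+", "", text.lower())
    if not normalized:
        return set()
    if len(normalized) <= k:
        grams = [normalized]
    else:
        grams = [normalized[i:i + k] for i in range(len(normalized) - k + 1)]
    return {hashlib.md5(g.encode("utf-8")).hexdigest()[:8] for g in grams}


def similarity(a, b):
    """Jaccard similarity between two fingerprint sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def duplicate_ratio(posts, threshold=0.8):
    """Fraction of posts that nearly duplicate an earlier post of the account."""
    seen = []
    duplicates = 0
    for post in posts:
        fp = fingerprints(post)
        if any(similarity(fp, old) >= threshold for old in seen):
            duplicates += 1
        seen.append(fp)
    return duplicates / len(posts) if posts else 0.0


def is_spammer(account, strong_threshold=0.7, vote_threshold=2):
    """Two-stage decision; features and thresholds are hypothetical."""
    posts = account["posts"]
    # Stage 1: strong-feature filter. One strong signal flags the account.
    if duplicate_ratio(posts) >= strong_threshold:
        return True
    # Stage 2: weak-feature voting. Each weak feature casts one vote.
    url_ratio = sum("http" in p for p in posts) / len(posts) if posts else 0.0
    votes = sum([
        account["following"] > 20 * max(account["followers"], 1),
        account["default_avatar"],
        url_ratio > 0.5,
    ])
    return votes >= vote_threshold


bot = {
    "posts": ["Buy cheap phones now at example shop!"] * 4,
    "followers": 3, "following": 50, "default_avatar": False,
}
human = {
    "posts": ["Had noodles for lunch today",
              "Watching the football match tonight",
              "Rainy morning in Beijing"],
    "followers": 200, "following": 150, "default_avatar": False,
}
print(is_spammer(bot), is_spammer(human))
```

In this sketch the strong feature short-circuits the decision, while a weak feature only contributes when enough of them agree, which mirrors the filtering-then-voting split the abstract describes.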
Keywords/Search Tags: spammer, characteristics discrimination, information fingerprints, strong characteristics, weak characteristics, confidence vote