Micro-blog Spam Accounts Detection Research

Posted on: 2018-09-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Chen
Full Text: PDF
GTID: 2348330518495432
Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, social networks such as Twitter and Sina Weibo have grown steadily and have greatly changed people's lifestyles and daily entertainment. They give users platforms for obtaining and posting information promptly in many forms, such as text, pictures, audio, and video, and they occupy an important position in daily life. While social networks provide users with a platform for information exchange, the proliferation of spam accounts has seriously damaged their ecological balance and user experience. This thesis defines spam accounts as two kinds of account: machine-controlled fake accounts, which mainly act as 'fake followers', and spam marketing accounts, which mainly send spam ads or other spam information. The thesis takes spam accounts on Sina Weibo as its research object. Based on an analysis of their behavioral characteristics, it applies a series of spam account detection methods and finally achieves detection by combining multiple data types in a heterogeneous information network. The thesis completes the following work:
1. It studies an efficient way to crawl and store Sina Weibo data.
2. It analyzes and summarizes the behavioral characteristics of spam accounts on the current Sina Weibo platform, performs statistical feature analysis based on that summary, and detects spam accounts using statistical features extracted from users' profile information and micro-blog information.
3. Because micro-blogs are short, use words irregularly, and contain special symbols, it designs micro-blog text preprocessing and a text representation model based on word embeddings, and uses text feature selection and classification methods to achieve text-based spam account detection.
4. It studies heterogeneous information networks and related similarity measurements, constructs a heterogeneous information network from Sina Weibo data, and proposes a scheme that combines users' profile information, micro-blog information, and social information to improve spam account detection performance. Experiments show that the method is effective.
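Item 4 refers to similarity measurements on heterogeneous information networks, but the abstract does not say which measure the thesis uses. As a rough illustration only, the sketch below computes PathSim, one commonly used meta-path-based similarity, over a toy User-Topic-User meta-path. The matrix `W`, the choice of topics as the linking node type, and the function names are all assumptions made for this example and do not come from the thesis.

```python
# Illustrative sketch only: PathSim over a User-Topic-User meta-path.
# The data and the choice of measure are assumptions, not the thesis's method.

def commuting_matrix(w):
    """Return M = W * W^T for a user-by-topic count matrix W (list of lists).

    M[i][j] counts the User-Topic-User path instances between users i and j.
    """
    n = len(w)
    return [
        [sum(a * b for a, b in zip(w[i], w[j])) for j in range(n)]
        for i in range(n)
    ]


def pathsim(m, x, y):
    """PathSim score: 2 * M[x][y] / (M[x][x] + M[y][y]); 0.0 if undefined."""
    denom = m[x][x] + m[y][y]
    return 2.0 * m[x][y] / denom if denom else 0.0


# Hypothetical toy data: rows are users, columns are topics,
# entries are how many posts of that user mention the topic.
W = [
    [2, 1, 0],  # user 0
    [2, 1, 0],  # user 1: same topic profile as user 0
    [0, 0, 3],  # user 2: shares no topic with the others
]

M = commuting_matrix(W)
print(pathsim(M, 0, 1))  # users with identical topic profiles
print(pathsim(M, 0, 2))  # users with no shared topic
```

Under these assumptions, a pairwise score like this could serve as one input feature for a classifier, alongside profile and text features. How the thesis actually combines them is not specified in the abstract.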
Keywords/Search Tags: heterogeneous information networks, spam accounts, social network, classification