Study On Bot Identification On Sina Weibo Based On Machine Learning

Posted on: 2017-12-04   Degree: Master   Type: Thesis
Country: China   Candidate: J Q Teng   Full Text: PDF
GTID: 2428330623454770   Subject: Management Science and Engineering
Abstract/Summary:
In recent years, with the spread of social networking sites and the development of Internet technology, Sina Weibo has become an important media and communication platform and has attracted widespread attention. A large number of bot users have emerged along with this trend. They abuse system resources, lower platform efficiency, and confuse the public. Malicious bots in particular mislead public opinion by spreading rumors and false information, which damages the network environment and harms the interests of ordinary users. This thesis takes Weibo bot users as its research object and aims to find an effective way to identify typical bot users on Weibo. The data come from Sina's Weibo data warehouse. Effective features are extracted through behavior analysis, and a bot-user identification model based on these features is then trained and evaluated with machine learning methods. The model uses two algorithms: the single decision tree C4.5 and random forest, an ensemble of decision trees. Both are trained with the machine learning toolkit Weka using 10-fold cross-validation. Finally, in the model validation stage, the classification performance of the two algorithms is compared using metrics derived from the confusion matrix, and the feature variables are adjusted to modify the model for further study. The experimental results show that the extracted features are effective and that the identification models built with C4.5 and random forest both perform well at identifying bot users. On the one hand, this study has practical significance for curbing false information and building a healthy network environment; on the other hand, it can inform work on bot-user identification on other platforms.
Keywords/Search Tags: Sina micro-blog, machine learning, bot users, user identification
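
The evaluation step described in the abstract (k-fold cross-validation followed by a comparison on confusion-matrix metrics) can be illustrated with a short sketch. This is plain Python rather than the Weka workflow used in the thesis; the function names `k_fold_indices` and `confusion_metrics`, the label convention (1 = bot, 0 = normal user), and the round-robin fold assignment are all assumptions made for illustration and are not taken from the thesis.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering range(n_samples).

    Hypothetical helper: samples are assigned to folds round-robin,
    with no shuffling or stratification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training set is every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix counts and derived metrics, with label 1 (bot) as positive."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bots flagged as bots
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # normal users flagged as bots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bots that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal users left alone
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

In a comparison like the one the abstract describes, each classifier would be fitted on the training indices of every split, its predictions on the held-out indices would be pooled, and `confusion_metrics` would be computed once per classifier so the two can be compared on the same metrics.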