Research On Method Of Malicious Weibo User Identification

Posted on:2018-12-03

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Li

Full Text:PDF

GTID:2348330512495174

Subject:Electronic and communication engineering

Abstract/Summary:

With the rapid development of the Internet, social networks such as Twitter and Facebook have grown dramatically and have become an indispensable part of modern life. In China, Weibo is the most popular social network application. It has gone beyond pure social contact to become a center of information diffusion, and it influences public opinion. Research on countering malicious users therefore has important practical significance, and techniques for identifying malicious users are an active research topic. This thesis studies the problem of identifying malicious users on Weibo. The work is partly supported by the National Natural Science Foundation of China (Nos. 61271308, 61172072, 61401015) and the Academic Discipline and Postgraduate Education Project of Beijing Municipal Commission of Education.

The thesis addresses the following issues. First, based on the features of malicious users, it analyzes the differences between malicious and normal users in their use of Weibo's "collection" function, taking into account Weibo's functional features and users' habits. "Collection quantity" and "collection speed" are then added to the feature list to verify their contribution to identifying malicious users. The classification algorithms implemented in Weka, together with parameter tuning, are used in the experiments. Second, to address missing user information, the classification results before and after handling the missing data are compared for three methods: Naive Bayes, the C4.5 decision tree, and Random Forest. The comparison shows that, when data are missing, both the C4.5 decision tree and Random Forest are robust, especially the latter. Third, the thesis simulates a practical setting to study how to improve identification efficiency on large-scale data. The proposed methods are implemented on the Hadoop platform, and the processing time for data sets of different sizes on different numbers of nodes, as well as the identification performance for malicious users, are compared.

In summary, the thesis analyzes the differences between malicious and normal users from the perspective of user features and, on that basis, selects suitable classification algorithms to identify malicious users. The results indicate that the identification accuracy reaches about 90%.
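The feature and missing-data steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which uses Weka and Hadoop. The abstract does not define "collection speed" or say how missing values are handled, so the definition used here (collections per day of account age), the choice of column-mean imputation, and all function names are assumptions made purely for illustration.

```python
from statistics import mean


def collection_speed(collection_count, account_age_days):
    """ASSUMED definition: collected posts per day of account age.

    Returns None when either input is missing or the age is zero.
    """
    if collection_count is None or not account_age_days:
        return None
    return collection_count / account_age_days


def impute_column_means(rows):
    """Replace None in each column with the mean of that column's observed values.

    Mean imputation is one common choice; the abstract does not state
    which method the thesis uses.
    """
    fills = []
    for column in zip(*rows):
        observed = [value for value in column if value is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        fills.append(mean(observed) if observed else 0.0)
    return [
        [fills[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical user records: [collection quantity, collection speed].
users = [
    [30, collection_speed(30, 10)],
    [None, None],  # a user whose collection data is missing
    [10, collection_speed(10, 10)],
]
complete = impute_column_means(users)
```

After imputation, `complete` can be passed to any classifier, and `accuracy` can be computed on the raw and imputed versions of the data to compare results before and after handling missing values, in the spirit of the comparison the abstract describes.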

Keywords/Search Tags:

Weibo, malicious users, machine learning, random forest, Hadoop

Related items

1	Research On Technology Of Malicious Users Identification Based On Weibo Content
2	Weibo Malicious User Detection Method Based On Behavior Feature Analysis
3	Research On Malicious User Identification Of Weibo Based On Machine Learning Classification Algorithms
4	Machine Learning Based Malicious Webpage Analysis
5	Research On ELM Image Classification Combining HOG And Random Forest
6	Detection Of Malicious HTTP Outbound Traffic Based On Random Forest
7	Research On The Random Forest Based Detection Of Malicious Mobile Applications At Runtime
8	Research On Android Malicious Application Detection Based On Machine Learning
9	Visual Interpretation And Analysis Of Random Forest
10	Research Of Random Forest Algorithm Based On Hadoop And Image Classification System Implementation