Research On Feature Selection Method Based On Micro-blog Spammers

Posted on: 2018-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Zhang
GTID: 2348330569486391
Subject: Electronic and communication engineering
Abstract/Summary:
Micro-blogging is popular with the public for its timeliness and convenience, but that convenience comes with an unavoidable spam problem. At present, micro-blog spam research concentrates mainly on spammer detection. How to detect micro-blog spammers effectively, solve the spam problem, and protect the interests of social network users is a current research hotspot. However, the high dimensionality and diversity of user behavior characteristics make spammer detection difficult. To reduce the dimension of user behavior characteristics and obtain features that benefit spammer detection, this thesis studies and analyzes feature selection methods. The main research work is as follows.

First, the high dimensionality of micro-blog behavior characteristics degrades classifier performance and raises computational complexity. To reduce these effects, this thesis proposes a new coupled feature selection algorithm, called ReF-HS, which is based on ReliefF and harmony search. The ReliefF algorithm first serves as a feature pre-filter, choosing the features with strong classification ability from the original features in order to reduce the search space. The key features are then obtained by coupling the harmony search algorithm. To further improve the speed of the algorithm while preserving classifier performance, the fitness function is designed around the naive Bayes classifier. The experimental results show that the ReF-HS algorithm runs fast, reduces the feature dimension, and improves classifier performance.

Second, the diversity of micro-blog behavior characteristics limits the available detection features, and spammers use tools to evade the existing ones. In response, this thesis analyzes the evasion tactics of spammers: purchasing large numbers of fans, posting more information, mixing in normal information, and posting heterogeneous information. Based on this analysis, the thesis proposes new detection features based on graph, neighbor, and time. The experimental results show that the new detection features are stable, rank very highly in feature ranking, and help spammer detection.
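
The two-stage idea behind ReF-HS (a filter that shrinks the search space, followed by a harmony search whose fitness comes from a naive Bayes classifier) can be illustrated with a small sketch. This is not the thesis's implementation, and the abstract does not give its parameters or exact fitness function. The sketch assumes a simplified two-class Relief (one nearest hit and one nearest miss) in place of full ReliefF, a Gaussian naive Bayes, and plain holdout accuracy as the fitness. All function names, the parameters `keep`, `hms`, `hmcr`, `par`, and `iters`, and the synthetic data are hypothetical.

```python
import math
import random

def relief_scores(X, y, rng, n_samples=50):
    """Simplified two-class Relief weights (assumption: stands in for ReliefF)."""
    n, d = len(X), len(X[0])
    span = [(max(r[j] for r in X) - min(r[j] for r in X)) or 1.0 for j in range(d)]
    diff = lambda a, b, j: abs(a[j] - b[j]) / span[j]
    dist = lambda a, b: sum(diff(a, b, j) for j in range(d))
    w = [0.0] * d
    for _ in range(n_samples):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        h = min(hits, key=lambda k: dist(X[i], X[k]))    # nearest same-class row
        m = min(misses, key=lambda k: dist(X[i], X[k]))  # nearest other-class row
        for j in range(d):
            w[j] += (diff(X[i], X[m], j) - diff(X[i], X[h], j)) / n_samples
    return w

def nb_accuracy(X, y, feats, train_idx, test_idx):
    """Holdout accuracy of a Gaussian naive Bayes restricted to `feats`."""
    if not feats:
        return 0.0
    stats = {}
    for c in set(y[i] for i in train_idx):
        rows = [X[i] for i in train_idx if y[i] == c]
        params = []
        for j in feats:
            vals = [r[j] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) + 1e-6
            params.append((mu, var))
        stats[c] = (math.log(len(rows) / len(train_idx)), params)
    correct = 0
    for i in test_idx:
        def log_post(c):
            prior, params = stats[c]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (X[i][j] - mu) ** 2 / (2 * var)
                for j, (mu, var) in zip(feats, params))
        correct += max(stats, key=log_post) == y[i]
    return correct / len(test_idx)

def harmony_search(candidates, fitness, rng, hms=8, hmcr=0.9, par=0.2, iters=100):
    """Binary harmony search: each harmony is a 0/1 mask over `candidates`."""
    k = len(candidates)
    score = lambda mask: fitness([f for f, bit in zip(candidates, mask) if bit])
    memory = [[rng.randint(0, 1) for _ in range(k)] for _ in range(hms)]
    scores = [score(m) for m in memory]
    for _ in range(iters):
        new = []
        for j in range(k):
            if rng.random() < hmcr:       # memory consideration
                bit = rng.choice(memory)[j]
                if rng.random() < par:    # pitch adjustment, here a bit flip
                    bit = 1 - bit
            else:                         # random consideration
                bit = rng.randint(0, 1)
            new.append(bit)
        s = score(new)
        worst = min(range(hms), key=scores.__getitem__)
        if s > scores[worst]:             # replace the worst harmony if improved
            memory[worst], scores[worst] = new, s
    best = max(range(hms), key=scores.__getitem__)
    return [f for f, bit in zip(candidates, memory[best]) if bit], scores[best]

def ref_hs(X, y, keep=5, seed=0):
    """Filter with Relief, then search the surviving features with harmony search."""
    rng = random.Random(seed)
    w = relief_scores(X, y, rng)
    candidates = sorted(range(len(w)), key=lambda j: -w[j])[:keep]
    idx = list(range(len(X)))
    rng.shuffle(idx)
    cut = int(0.7 * len(idx))
    train, test = idx[:cut], idx[cut:]
    return harmony_search(candidates, lambda f: nb_accuracy(X, y, f, train, test), rng)

def make_toy(n=120, d=10, seed=1):
    """Synthetic data (hypothetical): only features 0 and 1 carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(d)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = make_toy()
    print(ref_hs(X, y))  # (selected feature indices, holdout accuracy)
```

The filter stage keeps only the `keep` highest-weighted features, so the harmony search explores masks over that reduced set rather than over all original features. The fitness here is evaluated on a single holdout split for brevity; cross-validation or a penalty on subset size would be reasonable variations.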
Keywords/Search Tags: feature selection, evasion tactics, user behavior, spammer detection