
Detecting The Fake Accounts In Large-scale Social Networks Via Graph Computing

Posted on: 2018-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Jiang
Full Text: PDF
GTID: 2348330515974048
Subject: Computer system architecture
Abstract/Summary:
With the development of online social networks, attackers have begun to use fake accounts to carry out malicious behavior for personal gain. For example, they use fake accounts to harvest users' personal information or to spread spam messages, which threatens the safety of social network services and their users. Detecting fake accounts therefore plays an important role in network security and information security. Many approaches have recently been proposed to detect fake accounts automatically, including user feature-based, user content-based, and social network structure-based approaches. The first two suffer from high false positive rates and need large-scale field-visit datasets for training. However, with the growth of social networks, the state-of-the-art detection algorithms are difficult to scale out and to apply in practice to large-scale social networks. Moreover, traditional distributed processing frameworks for big data, such as MapReduce, have difficulty dealing with unstructured graph data. As a result, vertex-centric graph computing systems designed for large-scale graph data have been proposed. In this thesis, we analyzed the features of social network graph data and optimized the graph computing system by leveraging those features. We also analyzed the existing detection algorithms and implemented two different graph structure-based algorithms in our system. The details of our work are as follows:
1. At the system level, we analyzed existing social network graphs. Taking both the scale of the social networks and the configuration of a single server into consideration, we decided to use a single-machine graph computing system based on out-of-core computing.
2. At the graph data level, we took the power-law degree distribution of social networks into account. We divided the graph into two disjoint sets according to vertex degree, called the heavy set and the light set. We then stored the two sets in different formats, processed them with different execution models, scheduled them selectively with different strategies, and cached them with different strategies to optimize the graph system.
3. At the algorithm level, we analyzed the existing graph structure-based algorithms for fake account detection and classified them into two categories. One is power-iteration algorithms based on random walks, such as SybilRank; the other is community-detection algorithms based on graph traversal, such as COLOR/COLOR+. This thesis proposes the dSybilRank and dCOLOR algorithms, which convert these two kinds of algorithms into vertex-centric parallel iterative graph algorithms to improve the efficiency of the original detection algorithms.
4. Finally, we evaluated the performance of the two algorithms on our system. The results show that our system performs better: for instance, it needs only 459 seconds to process a network with 50 million vertices on a single server, whereas SybilRank, which needs a cluster of 11 m1.large machines, processes a network with 160 million vertices in 33 hours. We also compared our system with existing graph computing systems; it achieves speedups of 1.14x to 5.91x on social network graphs.
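The two ideas in points 2 and 3 can be illustrated with a small in-memory sketch. It is not the thesis's implementation: the abstract does not give the degree threshold, the storage formats, or the details of dSybilRank, so the function names `split_by_degree` and `propagate_trust`, the threshold value, and the toy graph below are all assumptions made for illustration. The propagation follows the general SybilRank-style scheme of spreading trust from known-honest seeds over about log2(n) rounds and ranking vertices by degree-normalized trust, written as a per-vertex scatter and gather loop.

```python
import math
from collections import defaultdict


def split_by_degree(adj, threshold):
    """Split vertices into a heavy set (degree >= threshold) and a light set.

    The threshold is a hypothetical tuning parameter, not a value from the thesis.
    """
    heavy = {v for v, nbrs in adj.items() if len(nbrs) >= threshold}
    light = set(adj) - heavy
    return heavy, light


def propagate_trust(adj, seeds, total_trust=1.0, iterations=None):
    """SybilRank-style trust propagation written in a vertex-centric form.

    adj maps each vertex to a list of its neighbors (undirected graph).
    Returns degree-normalized trust; low scores suggest likely fake accounts.
    """
    if iterations is None:
        # Early termination after roughly log2(n) rounds.
        iterations = max(1, math.ceil(math.log2(len(adj))))

    # All initial trust is placed evenly on the seed vertices.
    trust = {v: 0.0 for v in adj}
    for s in seeds:
        trust[s] = total_trust / len(seeds)

    for _ in range(iterations):
        incoming = defaultdict(float)
        # Scatter: every vertex sends an equal share of its trust to each neighbor.
        for v, nbrs in adj.items():
            if trust[v] == 0.0 or not nbrs:
                continue
            share = trust[v] / len(nbrs)
            for u in nbrs:
                incoming[u] += share
        # Gather/apply: the new trust of a vertex is the sum of what it received.
        trust = {v: incoming.get(v, 0.0) for v in adj}

    # Normalize by degree so high-degree vertices are not favored.
    return {v: (trust[v] / len(adj[v]) if adj[v] else 0.0) for v in adj}


# Toy graph: an honest triangle (a, b, c) and a fake triangle (x, y, z)
# joined by a single attack edge c-x.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "x"],
    "x": ["c", "y", "z"],
    "y": ["x", "z"],
    "z": ["x", "y"],
}

heavy, light = split_by_degree(graph, threshold=3)
scores = propagate_trust(graph, seeds={"a"})
ranking = sorted(graph, key=lambda v: scores[v])  # most suspicious first
```

Because only one edge connects the two regions, little trust reaches x, y, and z within the limited number of rounds, so they appear at the front of `ranking`. In a real out-of-core system the heavy and light sets would be stored and scheduled differently, which this in-memory sketch does not attempt to model.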
Keywords/Search Tags: Online Social Networks, Network Security, Fake Account, Graph Computing, Distributed System, Graph Algorithm