
Spammer Behaviour Analysis And Feature Construction In Multi-relational Social Networks

Posted on: 2019-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: J Yin
Full Text: PDF
GTID: 2428330572955299
Subject: Computer application technology
Abstract/Summary:
With the rapid development of e-commerce platforms and social networks, users increasingly rely on information from comments to make decisions. This has led to the emergence of false and fraudulent information. People who publish false comments and fraudulent information are usually called spammers, and in recent years spammer detection has become a hot topic in both academia and industry.

E-commerce platforms offer rich data types, such as scores, comment content, and comment time, so the mainstream detection methods use these data to construct behaviour features and then train classifiers with machine learning methods. Social networks, however, are based on interaction data and are relatively poor in data types. Spammer detection in social networks therefore requires a framework that relies on relational data and is independent of content data. Along this line, existing research defines complex network features (such as degree, k-core, PageRank, and connected components) and interaction sequence features. Nevertheless, the deep semantic information hidden in multi-relational networks has not been fully utilized.

In view of these issues, the aim of this thesis is to fully exploit the deep semantic information hidden in the heterogeneous network, and to define a series of user behaviour features and latent features based on relational network data. Using a real multi-relational social network dataset from Tagged.com, this thesis makes the following two contributions:

1. Based on non-content data, we exploit the deep semantic information hidden in the heterogeneous network and define a series of user behaviour features using relational network data. Since the Tagged.com dataset does not publish the name of each relational attribute, we first analyze the relation types according to the characteristics of the data, and then deduce the name of the actual relation type corresponding to each attribute. Secondly, taking the meaning of each relation type into account, we compare and analyze the behaviour patterns of spammers and legitimate users in each relation. From non-content information, such as active time, send/receive ratio, and the proportion of responses received after sending, we derive a series of feature indicators. Finally, we validate the performance of these features using the user labels provided in the Tagged.com dataset.

2. We propose a "Send-Receive" Role Separable Graph Embedding Model (RS-GEM) to extract and fuse the hidden features of heterogeneous relations. First, we build a graph in a shared embedding space, where nodes represent users and edges represent relations between users. Second, the number of interactions between the sending and receiving users is extracted as interaction vectors. Third, a sending-user feature matrix and a receiving-user feature matrix are constructed, and each user-user interaction vector is represented by the dot product of the corresponding sender and receiver features. The difference between the observed interaction vector and this dot-product representation is used to fit a probabilistic matrix factorization model, with constraints added to prevent overfitting during optimization. Finally, the hidden features of each user in the multi-relational social network are obtained by concatenating the features learned from each relation. Cross-validation results show that the latent features extracted by RS-GEM contribute significantly to spammer detection in multi-relational social networks.
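
The role-separated factorization described in the second contribution can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`factorize_relation`, `user_embeddings`), the hyperparameters, the use of plain stochastic gradient descent with an L2 penalty as the overfitting constraint, and the toy interaction counts are all assumptions made for illustration. Each relation is factorized into separate sender and receiver factors whose dot product approximates the observed interaction count, and the per-relation factors are then concatenated into one latent feature vector per user.

```python
import random


def factorize_relation(counts, n_users, dim=2, lr=0.05, reg=0.01,
                       epochs=500, seed=0):
    """Fit sender factors S and receiver factors R for one relation so that
    dot(S[i], R[j]) approximates counts[(i, j)] (L2-regularized squared error).
    """
    rng = random.Random(seed)
    S = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    R = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    for _ in range(epochs):
        for (i, j), c in counts.items():
            # Residual between the observed count and the dot-product estimate.
            pred = sum(s * r for s, r in zip(S[i], R[j]))
            err = c - pred
            for k in range(dim):
                s, r = S[i][k], R[j][k]
                # Gradient step with an L2 penalty (assumed constraint).
                S[i][k] += lr * (err * r - reg * s)
                R[j][k] += lr * (err * s - reg * r)
    return S, R


def user_embeddings(relations, n_users, **kwargs):
    """Concatenate each user's sender and receiver factors over all relations."""
    emb = [[] for _ in range(n_users)]
    for counts in relations:
        S, R = factorize_relation(counts, n_users, **kwargs)
        for u in range(n_users):
            emb[u].extend(S[u] + R[u])
    return emb


# Hypothetical toy data: counts[(sender, receiver)] = number of interactions.
relation_a = {(0, 1): 3, (1, 0): 1, (0, 2): 2, (2, 1): 1}
relation_b = {(2, 0): 2, (1, 2): 1}

emb = user_embeddings([relation_a, relation_b], n_users=3)
print(len(emb[0]))  # 2 relations x (2 sender dims + 2 receiver dims) = 8
```

The resulting vectors could then be passed, together with the behaviour features, to any standard classifier. Keeping sender and receiver factors separate lets a user who sends heavily but is rarely answered look different from one with balanced traffic, which is the intuition behind separating the two roles.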
Keywords/Search Tags:Spammer Detection, Multi-relational Social Network, Behaviour Analysis, Feature Construction, Latent Features