
Clustering Users Based On High-dimensional Fine-grained Features In Social Networks

Posted on: 2021-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Suo
Full Text: PDF
GTID: 2428330614472563
Subject: Cyberspace security
Abstract/Summary:
With the development of the Internet, social networks have become a platform where people acquire information, learn, share ideas and communicate. Social networks contain a large amount of user data. Mining these data and clustering users gives a fuller understanding of user groups, which is of great significance to personalized user services, platform content recommendation, commercial precision marketing, and the governance of online public opinion. User clustering has been studied extensively. However, most studies are limited to a single category of features or to a few feature dimensions, for example using interest, topic and behavior features to build user profiles, or using network structure features to detect community structure. Existing research does not use multiple types of features to cluster social network users, so the user representation is low-dimensional and the clustering analysis lacks comprehensiveness and accuracy. It is therefore important to cluster users on high-dimensional fine-grained features in order to fully understand user communities.

In social networks, user features include user attribute features, network structure features and relationship semantic features. According to the feature types involved, this thesis carries out user clustering analysis from three aspects:
(1) It uses attribute features to cluster users, discover user groups with similar features, and construct user group portraits.
(2) It uses attribute features and network structure features for community discovery, detecting the real community structure in the network and obtaining closely connected user groups.
(3) It uses attribute features, network structure features and semantic features to generate heterogeneous graph embedding representations, and applies the resulting node embeddings to classification and clustering tasks.

The main research work of this thesis is as follows:
(1) This thesis proposes a user group portrait model based on high-dimensional fine-grained attribute features. It extracts user attribute features in 4 categories and 20 subclasses, covering basic features, content features, statistical features and behavior features, and describes users in terms of interest topics, location preferences, posting habits, emoticon use, and so on. User groups are obtained through clustering analysis and used to construct relatively complete and comprehensive user group portraits. The analysis yields 17 representative user groups, including the institutional group, political certified male users, and elite certified female users. Compared with existing work, the user group portrait model in this thesis involves features of finer granularity, more types and wider range.
(2) This thesis proposes an Attribute Feature Enhancement Louvain method (AE-Louvain), which uses node attribute features and network structure features to detect community structure by optimizing both the modularity increment and the similarity of node attribute features. Experiments verify the validity of AE-Louvain. They also show that, like the modularity increment, the similarity between node attribute features can determine which community a node is assigned to.
(3) This thesis proposes Heterogeneous Graph Attention Auto-Encoders (HGATE), which take into account node attribute features, network structure features and the semantic features of heterogeneous graphs. HGATE stacks encoder/decoder layers and uses a hierarchical attention mechanism to achieve unsupervised representation learning on heterogeneous graph data, and it is suitable for both transductive and inductive learning. Node classification experiments on two heterogeneous graph datasets, with comparison against eight recent graph representation methods, show that HGATE outperforms most state-of-the-art unsupervised graph embedding methods and matches or outperforms most state-of-the-art supervised methods.
(4) This thesis designs and develops a user portrait display system, which crawls the home page information, blog information and social relationship data of target microblog users, extracts user attribute features, clusters users, and visually displays individual and group user portraits.
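
The abstract does not give the AE-Louvain objective, so the following is only a minimal sketch of the idea stated in item (2) of the main research work: scoring a candidate community for a node by both the modularity increment and the attribute similarity. The modularity gain is the standard Louvain formula for moving an isolated node into a community. The cosine similarity to a community centroid, the linear weighting `alpha`, and all function names are assumptions made for illustration, not the thesis's actual method.

```python
import math


def modularity_gain(k_i_in, sigma_tot, k_i, m):
    """Standard Louvain gain for moving an isolated node i into community C.

    k_i_in:    total weight of edges between node i and nodes in C
    sigma_tot: total degree of the nodes already in C
    k_i:       degree of node i
    m:         total edge weight of the graph
    """
    return k_i_in / m - (sigma_tot * k_i) / (2.0 * m * m)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combined_score(k_i_in, sigma_tot, k_i, m, node_attrs, centroid, alpha=0.5):
    """Hypothetical combination: alpha weights structure, 1 - alpha attributes."""
    structure = modularity_gain(k_i_in, sigma_tot, k_i, m)
    similarity = cosine(node_attrs, centroid)
    return alpha * structure + (1.0 - alpha) * similarity


# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge 2-3 (m = 7).
# Node 2 (degree 3) is treated as isolated and offered two communities.
m, k_2 = 7.0, 3.0
cand_a = dict(k_i_in=2.0, sigma_tot=4.0, centroid=[1.0, 0.0])  # {0, 1}
cand_b = dict(k_i_in=1.0, sigma_tot=7.0, centroid=[0.0, 1.0])  # {3, 4, 5}
attrs_2 = [0.0, 1.0]  # made-up attribute vector resembling community B


def score(cand, alpha):
    return combined_score(cand["k_i_in"], cand["sigma_tot"], k_2, m,
                          attrs_2, cand["centroid"], alpha)


# Structure alone prefers A; with attributes weighted in, B wins.
structure_only = "A" if score(cand_a, 1.0) > score(cand_b, 1.0) else "B"
with_attributes = "A" if score(cand_a, 0.5) > score(cand_b, 0.5) else "B"
```

In this toy setup the attribute term flips the assignment of node 2, which mirrors the abstract's observation that attribute similarity, like the modularity increment, can determine a node's community. How the thesis actually balances the two terms is not stated in the abstract.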
Keywords/Search Tags: Social Networks, Cluster Analysis, User Group Portrait, Community Discovery, Heterogeneous Graph Embedding Representation