Font Size: a A A

Research On Community Detection Algorithm Of Attributed Graphs Based On Link Clustering

Posted on:2023-06-23Degree:MasterType:Thesis
Country:ChinaCandidate:Z H TangFull Text:PDF
GTID:2530307118499494Subject:Software engineering
Abstract/Summary:PDF Full Text Request
Nowadays,people are surrounded by different kinds of networks.With the rapid development of information technology,network sizes have been becoming larger and larger,and their topological structures are more complex than ever before.Besides,the rich attribute information associated with network nodes further increases the complexity of networks.This kind of networks is defined as attributed graphs which are characterized by nontrivial topology and rich content.In recent years,a series of community detection algorithms have been proposed by using the topology or content information to detect meaningful communities from a given attributed graph.However,most of algorithms fail to combine both of them,thus resulting in poor performance in community detection.Moreover,few algorithms can detect link communities,which are believed to be able to better explain the behavior of nodes in forming overlapping communities.However,since the number of links in an attributed graph is generally much larger than that of nodes,the computational complexity of detecting link communities,especially for the attributed graph,is greatly increased.In order to solve these two problems,different algorithms have been proposed in this thesis,and the research content is unfolded as follows:(1)In order to consider both the topology and attribute information and reduce the computational complexity of detecting link communities,a new Bayesian probability model is proposed by approximately simulating the generation process of attributed graphs as a finite mixed label set of link communities.Instead of completely generating the topology and attribute information of links,the approximate generation process targets to describe the sketch of an attributed graph.More specifically,this process only considers the type of connection between pairs of links and the number of occurrences of attribute values associated with the nodes of links.As a result,the amount of sketch information generated is far less than all of the information available in a given attributed graph upon links,thus accelerating the generation process and improving performance in terms of efficiency.In order to solve the inference problem of this Bayesian model,an efficient approximation algorithm,namely VBLCD,is developed by following the variational Bayesian inference framework.Extensive experiments have been conducted on five practical datasets,including two citation networks and three social networks,and the performance of VBLCD has been compared with several state-of-the-art algorithms.Experimental results show that the average performance of VBLCD is the best among all algorithms,thus verifying the superiority of VBLCD in the task of community detection.In addition,we further study the scalability of VBLCD.After parallelization,VBLCD significantly improved by more than two orders of magnitude.In this regard,the scalability of VBLCD is also superior for largescale attributed graphs.(2)In order to solve the problem of high computational complexity yielded by the detection of communities upon links,a fast variant of VBLCD,namely VBLCDF,is proposed.In particular,a random uniform sampling strategy is adopted by VBLCDF,and it first selects a certain percentage of links at the beginning of each iteration,and then combines these links and their adjacent links as the training data,which is then used to train VBLCDF during the iteration procedure.A series of experiments have been conducted on the same five attributed graphs.Experimental results show that the average accuracy of VBLCDF algorithm is comparable to that of VBLCD,whereas its scalability has been significantly improved given an appropriate sampling rate.Hence,the superiority of VBLCDF in the detection of link communities has been demonstrated from the perspectives of effectiveness and efficiency.
Keywords/Search Tags:attributed graph, Bayesian probabilistic model, approximate generative process, community detection, clustering
PDF Full Text Request
Related items