Font Size: a A A

Research On Problem Of Top-K Maximal Clique Enumeration Based On Uncertain Graphs

Posted on:2022-04-24Degree:MasterType:Thesis
Country:ChinaCandidate:H WangFull Text:PDF
GTID:2518306494480954Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Maximal cliques are a type of dense subgraphs.Maximal clique enumeration is used to mine the complete subgraphs that are not included by other cliques from a given graph.The Top-K maximal clique enumeration is used to return the K maximal cliques with largest size,find a set of closely related vertices in applications such as biomedicine and social networks for auxiliary analysis.Compared with deterministic graphs,graphs in practical applications often contain probability information to describe the degree of possibility of incomplete or imprecise data.When the existing method solves Top-K maximal clique on the uncertain graph,it returns the top K maximal cliques with the highest probability.Since the probability of maximal cliques will continue to decrease as the size of the vertice becomes larger,this method will discard the maximal cliques with large vertice sizes and low probability.These discarded maximal cliques have a large scale and have important application value in practice.This paper studies the enumeration of Top-K maximal clique on uncertain graphs.The specific research content is as follows.First of all,this paper redefines the Top-K maximal clique enumeration problem on the uncertain graph,which is used to return the top K maximal cliques with the largest scale whose probability value is greater than a given threshold.On this basis,a basic algorithm Top-KC,which enumerates Top-K maximal cliques on the uncertain graph,is proposed.The algorithm saves the current Top-K maximal clique by maintaining a result set of size K and continuously updates it in the process of enumerating the maximal clique to ensure that the vertice size of the result concentrated clique is as large as possible.Aiming at the redundant calculation problem of the basic algorithm,this paper proposes two optimization strategies to speed up the enumeration of maximal cliques in the Top-KC algorithm.Respectively,sorting and pruning are based on the degree and Core-Number of the vertices,to reduce the number of maximal cliques that need to be enumerated and avoid unnecessary calculations.Secondly,a preprocessing strategy based on the number of triangles on the sides is proposed to simplify the scale of the data graph,thereby reducing the computational cost of the maximal clique enumeration algorithm.The basic idea is to calculate how many triangles each edge can contain before enumerating.If the number is less than 2,it means that the maximal clique size that the two vertices corresponding to the edge can participate in is three vertices,so the edge can be deleted to reduce the scale of the graph.After simplifying the graph based on this strategy,applying the above two optimization strategies can significantly improve the efficiency of solving Top-K maximal clique.Finally,experiments are conducted based on 15 real data sets,and the efficiency of the algorithm proposed in this paper is verified by using the number of enumerated maximal cliques in the Top-K maximal clique solving process and the total calculation time as indicators.Experimental results prove that the degree-based optimization strategy and the Core-Number-based optimization strategy can effectively improve the efficiency of enumerating maximal cliques,and the edge deletion strategy based on the number of triangles on the sides can further reduce the calculation time based on the first two optimizations.
Keywords/Search Tags:uncertain graph, Top-K maximal clique enumeration, size of vertices
PDF Full Text Request
Related items