Similarity Top-k Query For Large-scale Dynamic Graph

Posted on:2019-08-11

Degree:Master

Type:Thesis

Country:China

Candidate:X Q Li

Full Text:PDF

GTID:2428330545954777

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

The graph data has property of describing the various objects and their relationships in the real world,so it has been widely used in many fields,such as social network processing,intelligent transportation network and biological molecules.The similar nodes query of some similar nodes has always been a hotspot problem in the field of graphs.The traditional graph node similarity calculation method is usually more efficient on small scale or static graph.It's not only to calculate the similarity between graph nodes,but also to rank and get the Top-k results in order to satisfy the demand.With the popularity of the Internet and the development of big data,the number of graph nodes is increasing with each passing day,and the scale of graph nodes is growing bigger.The data of graph notes update more and more frequently,so it's unable to adapt to large-scale dynamic graph and the processing efficiency is very low.On account of the problems,the paper improves the classical SimRank algorithm in the graph node similar processing area and lucubrates the similar node Top-k query algorithm in large-scale dynamic graphs.Firstly,the thesis differentiates the similar relationship according to the distance of the similarity and defines three similar relationships: direct-similarity,minor-similarity and generic-similarity.Secondly,according to the relationship of n(the number of nodes satisfied the direct-similarity with query node)and k,the corresponding algorithm is used to search for similar nodes respectively.Of which,when the value of k-n is bigger than a certain number,because it may be necessary to evaluate the candidate set that is more than 3-stepsaway from query node,so the thesis select the single-direction graph to improve the query efficiency.When the graph data is dynamic,use the trigger-update method,the thesis determines the degree of correlation between the update and the query.The method of updating timely will be utilized on the condition of it intimate degree of correlation,or the method of updating at regular time will be utilized.The thesis queries in two aspects:similar node query for large-scale graphs and for large-scale dynamic graphs.Themain content of this thesis is as follows:(1)Improving SimRank algorithm.To avoid that the sum of �long distance relationship� is bigger than that of �short distance relationship� due to the superposition in original SimRank algorithm,and make it more practical,the thesis defines three similar relationships: direct-similarity,minor-similarity and generic-similarity according to the distance of the similarity relationships,also presents similarity calculation methods corresponding to three similar relationships.(2)When there is generic-similarity relationship between the nodes and the query node,the thesis uses the method of extracting single-direction graph to improve the query efficiency and reduce data redundancy.(3)Solving the update and maintain problems for node similarity Top-k query in large-scale graphs.The thesis uses the snapshot method to record the changes in the graph over a period of time.The trigger update policy is utilized,that is to update timely when the updated graph is relevant to the query;on the contrary,updating at regular time will be utilized,and after merging the updates,the updated graph is unified.(4)Taking experiments on simulated data sets and real data sets which prove that the algorithm achieved the validity and feasibility of similar node Top-k query in large-scale dynamic graphs with good performance.

Keywords/Search Tags:

PDF Full Text Request

Related items

1	Constraint Top-k Query For Large-scale Dynamic Graph Based On Frequent Subgraph
2	Research On Subgraph Query Method For Large-scale Dynamic And Directed Label Graph
3	Reachability Query Based On Improved Huffman Coding Over Large-scale Dynamic Graph
4	Research On Node Similarity Measurement Method For Large Scale Dynamic Graph
5	Research On Key Techniques Of Query Processing Over Large-scale Graph Data
6	Research On Query Processing Method For Large Scale Graph
7	Research On The Large-scale Evolution Graph Data Partition Method Based On Vertex-Cut
8	Similarity Nodes Query Processing Approach In The Evolution Process Of Large Dynamic Graph
9	Graph Query Algorithms Research Over Dynamic Graph Data
10	Research On Large-scale RDF Data Query Method Based On Graph Clustering