Font Size: a A A

Research On Subgraph Query Method For Large-scale Dynamic And Directed Label Graph

Posted on:2019-11-17Degree:MasterType:Thesis
Country:ChinaCandidate:L Y JiangFull Text:PDF
GTID:2428330545454773Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the enrichment and expansion of the application field,the figure as one of the commonly used data structure,the real world of many fields use figure to describe the complex and huge logical relationship,such as social networking,biological information network,intelligent network modeling in the field of emerging.At the same time,there are many kinds of vertices in some complex networks,and the label graphs are often used to model such complex networks.Like micro-blog in social networks,we can use the vertex to represent micro-blog users,and the attention between any user can be expressed by a directed edge,when the label can be used to represent the user's gender,location,and focus areas.As a key problem of graph data analysis,subgraph query has attracted wide attention of researchers.Dynamic label graph subgraph query,that is,return structure and tag values are all isomorphic to some sub graphs of query graph.However,subgraph isomorphic query is a NP complete problem.With the increasing and frequent updates of the data graph,the query time will be greatly increased.The key to solve this problem is to improve the query efficiency by creating a high quality index structure.For example,the registration of new users,the cancellation of old users,and the increase or cancellation of concerns between users will cause the dynamic change of label graph in micro-blog.The registration of new users in micro-blog applications and the attention of users will be abstracted as the insertion of the vertex or edge of the label graph.The cancellation of the user account in micro-blog,the cancellation of attention,etc.will be abstracted to the vertex or edge of the label graph.The research on the existing subgraph query algorithm shows that the traditional subgraph query algorithms have many problems,such as low query efficiency,large storage cost and neglecting vertex label information,as the scale of graph data is increased and updated frequently.Therefore,this paper mainly studies subgraph query problem of large-scale dynamic directed label graph.The main contents of this paper can be summarized as follows:(1)The dynamic hierarchical sequence index is proposed,that is,the DHS index.The index by using partial order relation of directed acyclic graph vertex topology into hierarchical sequence,at the same time add vertex Numbers in each sequence information,without complex feature extracting relations can be realized rapidly and accurately.(2)Aiming at the index change caused by the dynamic change of the graph,the updating point topology extended index maintenance strategy is proposed.This strategy adopts incremental updating method,can only change from local vertex or edge topology extended to dynamic maintenance of the index,in order to reduce the overall reconstruction index of overhead.(3)A subgraph query method based on DHS index is proposed for the subgraph query of dynamically directed label graphs,for short,it is –SubDBGragh.This method matches the query graph and the hierarchical sequence of the data graph layer by layer to filter the interference of a large number of non-matching items and obtain the candidate sets.The matching of the map is simple,and the number of isomorphic tests can be reduced,so as to the efficiency of the query can be improved.At the same time,a relational matching strategy is proposed to match the edges of candidate sets to get the final query results.(4)Aiming at the distributed processing problem of large-scale SubDBGragh,the establishment of boundary data set and the Local Dynamic Hierarchical Sequence which is the LDHS index are proposed.The cut-off data sets is used to store the information of the cutting point and the edge of the cutting edge.Before the data processing,the cut-off data sets are loaded into each partition,and the data sets are bounded to assist the related partitions when the subgraph query is carried out.The advantages of this method are that there is no need to access the partition data at the same time when the query is executed,only the corresponding query is executed in the corresponding graph partition,and the result connection of the query is merged into the complete result,and the frequent transmission of the data between the partitions in the query.
Keywords/Search Tags:large-scale dynamic labeled graph, subgraph query, topological hierarchical sequence, graph index, distributed environment
PDF Full Text Request
Related items