Font Size: a A A

Research On Query Processing Method For Large Scale Graph

Posted on:2020-02-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:W ChenFull Text:PDF
GTID:1368330599459894Subject:Software engineering
Abstract/Summary:PDF Full Text Request
The rapid development of computer network,big data and artificial intelligence has brought opportunities and challenges to data mining technology.In many application fields,data are generating and accumulating at an unprecedented speed,which has resulted in a large number of complex,large scale and dynamic graphs to describe the complex and changeable relationship among data.Management technology for large scale graph has always been a research hotspot in the field of database.And the query and processing based on graph is the most basic and core problem,such as k-edge connected components with maximum connectivity query,subgraph isomorphism query,top-k subgraph isomorphism query and reachability query,which are widely used in molecular chemistry and pharmacology,bioinformatics,computer networks,social networks,collaborative work networks,electronic commerce networks,public transport networks and so on.In this paper,through in-depth analysis and research on the problems of existing methods,the efficient index construction methods are proposed,and the corresponding query algorithms are designed according to the new index,which can improve the query execution efficiency and scalability,be applied to large scale graph.Firstly,the existing k-edge connected components with maximum connectivity query methods suffer from high cost of time and space on constructing index,low query efficiency,and the memory consumption will increase sharply for large scale graph.Therefore,based on k-edge connected components,a new hierarchical index is constructed by utilizing the relationship among the connected components in order to reduce the index construction time and storage space greatly.At the same time,according to the index,the corresponding query methods are designed to improve the query efficiency of k-edge connected components with maximum connectivity.Secondly,in subgraph isomorphism query,due to the redundancy calculation and the excessive amount of intermediate result data in the process of index construction,the query algorithm has high space and time cost and poor scalability.An efficient index construction algorithm based on the neighbor's storage structure of vertices in graph is proposed so as to improve the efficiency of subgraph isomorphism query.In top-k subgraph isomorphism query,aiming at the query inefficiency caused by the problems of redundant enumeration and large memory overhead in the filtering stage,and excessive computation caused by improper verification order in the verifying stage,the preprocessing strategy of lossless compression of graph based on the concept of completely equivalent vertices is proposed,the optimized filtering strategy and the verifying strategy of reasonable matching order are proposed to improve the efficiency and scalability of the algorithm.Finally,in the reachability query processing,the existing methods have the problems that the efficiency is obviously reduced and the performance is unstable when the proportion of reachable queries is higher than that of unreachable queries,therefore,a method is proposed which can quickly construct small-scale index for graph by using the characteristics and properties of graph and its spanning tree,and then the bidirectional decision method is designed,that is,judging a given query from reachability and unreachability,in order to accelerate the response speed of reachability query.Especially,when the proportion of reachability queries increases,the algorithm has better scalability and more stable performance.
Keywords/Search Tags:large scale graph, k-edge connected components, subgraph isomorphism, reachability
PDF Full Text Request
Related items