Font Size: a A A

Subgraph Matching Over Knowledge Graph

Posted on:2022-06-23Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y H SunFull Text:PDF
GTID:1520307040469864Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The problem of subgraph matching is one key issue in graph search,which is an NPComplete problem.With the explosive emergence of general and domain knowledge grah applications,subgraph matching has become a popular research topic in the field of knowledge graph analysis,which has a wide range of applications including question answering,semantic searching and community detection.By exploring the multi-graph and vertex multi-label features of knowledge graph,this thesis clarifies a new challenge brought by these features in the subgraph matching problem,and carries out a study on the subgraph matching problem of knowledge graph.Furthermore,this thesis takes two popular data phenomenas as a foothold,namely dynamic and big data phenomenas,discusses the new problems of subgraph matching over knowledge graph that spawned under the dynamic and big data phenomena,and develops the studies on subgraph matching over dynamic and large-scale knowledge graphs respectively.The main research works are described as follows:1.Study on subgraph matching method on knowledge graph.The multi-graph and vertex multi-label features of knowledge graph exacerbate the complexity of subgraph matching problem.Firstly,the multi-graph structure feature of knowledge graph will lead to heavy candidate calculation costs,the main reason is that the comparison of vertices and edges between query graph and data graph is an inclusive operation,while the comparison of vertices and edges in a general graph is a Boolean operation.Secondly,the vertex multi-label feature of knowledge graph will cause that dissimilar query vertices correspond to a same data vertex,so that the subgraph matching process will produce a lot of redundant candidate calculation and connection calculation of node pairs.In order to resolve the above challenge caused by the multi-graph and vertex multi-label features of knowledge graph,this thesis proposes a subgraph-indexed subgraph matching and optimization method in chapter 2.This method is based on a flow index structure guided by a query vertex order(FGq T index),which redirects the multi-edges between candidate vertices by the matching order relatonships between query vertices,and prunes the redundant vertices and edges in FGq T index using a two-way pruning strategy(top-down construction and bottom-up refinement),thus FGq T index can avoid the heavy and redundant calculations.Based on the FGq T index,this thesis proposes a backtracking-based subgraph matching algorithm FGq T-Match,and two incremental subgraph matching algorithms FGq T-CRS(calculation-reserved subgraph matching algorithm)and FGq T-CBS(connectivity-constraint incremental subgraph matching algorithms).The experimental results show that the proposed subgraph matching method improves the time and space efficiencies,compared with similar methods.2.Study on the continuous subgraph matching method of dynamic knowledge graph.The core idea of most subgraph matching methods still adopts the iterative subgraph search method based on backtracking.However,in the subgraph search process,a large number of wrong candidate vertex pairs are forwardly expanded into the subgraph isomorphic mapping set,which intensifies the computational cost of the backward tracing of wrong vertex pairs.Especially in the continuous subgraph matching process,the inflow and outflow of edges will frequently refresh the candidate state of vertex pairs.Therefore,the undirectional subgraph search starting with refreshed vertices will result in a huge redundant space cost.In order to resolve the problem of undirected subgraph searching,this thesis proposes a subgraph matching method based on the core candidate vertex sequence encoding subgraph in chapter 3.This method is based on a subgraph encoding technique based on core candidate vertex sequence,which constrains the graph update stream to a given candidate area and restricts the search direction of candidate node pairs,so as to solve the problem of undirected subgraph searching.Based on the above subgraph encoding technique,this thesis designs a dynamic knowledge graph index and a state transition model,and proposes an incremental subgraph matching algorithm based on the encoding subgraph.The experimental results show that the proposed continuous subgraph matching method has a better advantage in processing complex query graphs and large-scale data graphs,at the expense of smaller data update and maintenance costs.3.Study on the distributed subgraph matching method of large-scale RDF graphs.Knowledge graph uses RDF data as its original form and model information as a graph.Actually,each RDF represents a specific fact,which is indivisible.For the division of large-scale graph data,the most commonly used strategies are horizontal and vertical division.The horizontal division adopts the minimum cutting edge strategy to divide the irregular data graph.However,this cutting edge strategy will destroy the facts in the fact graph.The vertical division stores the data of the same triple pattern in the form of table.Although vertical division can not destroy factual knowledge,this strategy of storing in the form of table can easily lead to a large amount of repeated storage of data.In order to resolve the problem of indivisible knowledge,this thesis proposes a distributed subgraph matching method based on dominance partitioned RDF graph in chapter 4.This method is based on two concepts,vertex-dominated connected pattern subgraph and dominatance-partitioned pattern hypergraph.Based on the above concepts,this thesis divides a large RDF graph into multiple distributed RDF subgraphs,and designs a distributed subgraph matching algorithm on dominance-partitioned RDF Graph.The experimental results show that the proposed distributed subgraph matching method has a better time performance in dealing with more complex query topological structure.
Keywords/Search Tags:Knowledge Graph, Subgraph Matching, Subgraph Index, Subgraph Encoding, Dominance Partition
PDF Full Text Request
Related items