Font Size: a A A

Research On Group Steiner Tree Based Search Algorithms Over Knowleage Graphs

Posted on:2022-08-29Degree:MasterType:Thesis
Country:ChinaCandidate:Y X ShiFull Text:PDF
GTID:2518306725493174Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
A knowledge graph represents a set of entities and their relations.To explore the content of a large and complex KG,a convenient way is keyword-based querying.Traditional methods answer an exploratory keyword query by computing a group Steiner tree,which is a minimum-weight subgraph that connects all the keywords in the query.Group Steiner tree is a NP-hard problem,existing algorithms with provable quality guarantees have prohibitive run time on large graphs.This thesis proposes two approximation algorithms:KeyKG and KeyKG+,and proves the approximation ratio of KeyKG and KeyKG+ are both g-1.The algorithms rely on Hub Labeling,which labels each vertex in a graph with a list of vertices reachable from it to compute distances and shortest paths.This thesis devises two hub labels:a conventional static hub label that uses a new heuristic to improve pruned landmark labeling,and a novel dynamic hub label that inverts and aggregates query-relevant static labels to more efficiently process vertex sets.Experiments show that the approaches can compute a reasonably good approximation of answers in milliseconds on real-world million-scale knowledge graphs.Furthermore,recent studies have suggested improving the semantic cohesiveness of a query answer by minimizing the pairwise semantic distances between the entities in a subgraph,but it remains unclear how to efficiently compute such a subgraph.This thesis formulates it as a quadratic group Steiner tree problem.Group Steiner tree problem is a subproblem of quadratic group Steiner tree problem,so quadratic group Steiner tree problem is also NP-hard.This thesis designs two approximation algorithms for the proposed problem:QO and EO,and proves that the approximation ratio of QO and EO are 2(g-1)2 and(g-1)2|V| respectively.This thesis further presents various heuristics,e.g.,pruning and ranking strategies,to improve their practical performance.Experiments show that the algorithms can improve the cohesiveness of the answers by up to 3-4 times,and compute answers in seconds on million-scale knowledge graphs.
Keywords/Search Tags:knowledge graph, keyword search, group Steiner tree, hub labeling, semantic cohesiveness, quadratic group Steiner tree
PDF Full Text Request
Related items