Research on Schema Optimization and Subgraph Matching on Property Graph

Posted on: 2024-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: H H Li
GTID: 2530307085987309
Subject: Computer application technology
Abstract/Summary:
As an important branch of artificial intelligence, a knowledge graph describes concepts, entities, and their relations in the physical world in symbolic form. Entities are connected to each other through relations to form a networked knowledge structure. Knowledge graphs are widely used in recommender systems, knowledge-based question answering, social network analysis, and other fields. The property graph is a common representation and storage form for knowledge graphs: attributes are stored directly on nodes, so no additional nodes need to be created to hold them. Because property graphs are also easy to traverse, they are widely applied to data representation in various business scenarios. A property graph schema is a formalized expression of the concepts in the property graph and their relations, and serves as a logical representation built on top of the data layer.

As property graphs grow in scale and their applications become more widespread, they bring abundant information but also great challenges for data retrieval. At present, most studies focus on the property graph itself, that is, on how to use the property graph data and schema constraints to improve query efficiency. In fact, the property graph schema contains rich conceptual and semantic relations, and its quality directly affects both the scale of the property graph data and the query performance on it. However, relatively few studies address improving query efficiency on property graphs through schema optimization. In addition, subgraph matching, a common query problem on property graphs, has been one of the research hotspots in graph data management. Most existing work on subgraph matching generates an index structure according to the query graph and then performs recursive enumeration in a fixed matching order to obtain the final matching result. However, current algorithms generally ignore the influence that the edges between nodes in the query graph have on the candidate set, which causes a large number of redundant operations during enumeration and in turn reduces query efficiency.

To solve these problems, the thesis studies property graph schema optimization and subgraph matching. From the perspective of logic, it proposes a property graph schema optimization method based on combination rules; from the perspective of data, it proposes a subgraph matching method based on a neighborhood relation label tree encoding index. The main research work and innovations are as follows:

(1) For the problem of property graph schema optimization, the thesis proposes a schema optimization method based on combination rules. By considering the rich semantic relations in the schema together with actual queries, relations with certain characteristics and a high degree of recognizability are extracted from the property graph schema, and corresponding schema optimization rules are designed on this basis. The proposed rules are then applied to the property graph, which indirectly optimizes the property graph while ensuring that the original semantic information is not lost, thereby improving query performance.

(2) For the problem of subgraph matching on property graphs, the thesis proposes a subgraph matching method based on a neighborhood relation label tree encoding index. First, a neighborhood relation label tree encoding index that comprehensively considers node information is constructed. Second, a candidate set is generated for each node in the query graph according to the index, an auxiliary data structure is built, and pruning rules based on common neighbors and unique candidate nodes are proposed to refine the auxiliary data structure. Third, the matching order is determined based on the candidate nodes. Finally, a dynamic enumeration algorithm based on equivalent nodes is proposed to complete subgraph matching.

(3) Extensive experiments are conducted on multiple real datasets as well as simulated datasets to compare the proposed methods with existing state-of-the-art algorithms. The experimental results verify the correctness and effectiveness of the methods proposed in the thesis.
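
The general filter-then-enumerate pattern behind subgraph matching can be illustrated with a minimal Python sketch. This is not the thesis's neighborhood relation label tree encoding index or its dynamic enumeration algorithm. It assumes a much simpler filter (a node's label plus the set of edge types and neighbor labels around it) and plain backtracking, and every name in it (PropertyGraph, signature, candidates, match, and the sample data) is hypothetical.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny directed property graph: labeled nodes with properties, typed edges."""

    def __init__(self):
        self.labels = {}             # node id -> label
        self.props = {}              # node id -> property dict (stored on the node)
        self.out = defaultdict(set)  # node id -> {(edge type, target id)}
        self.inc = defaultdict(set)  # node id -> {(edge type, source id)}

    def add_node(self, nid, label, **props):
        self.labels[nid] = label
        self.props[nid] = props

    def add_edge(self, src, etype, dst):
        self.out[src].add((etype, dst))
        self.inc[dst].add((etype, src))

    def signature(self, nid):
        """Set of (direction, edge type, neighbor label) triples around a node."""
        sig = {("out", t, self.labels[v]) for t, v in self.out[nid]}
        sig |= {("in", t, self.labels[u]) for t, u in self.inc[nid]}
        return sig


def candidates(query, data):
    """Filter step: same label, and the data node's signature covers the query node's."""
    dsig = {d: data.signature(d) for d in data.labels}
    return {q: {d for d, lab in data.labels.items()
                if lab == query.labels[q] and query.signature(q) <= dsig[d]}
            for q in query.labels}


def match(query, data):
    """Enumerate injective mappings of query nodes to data nodes that preserve typed edges."""
    cands = candidates(query, data)
    order = sorted(query.labels, key=lambda q: len(cands[q]))  # fewest candidates first
    results = []

    def consistent(q, d, mapping):
        m = {**mapping, q: d}  # include the tentative pair so self-loops are checked
        for t, q2 in query.out[q]:
            if q2 in m and (t, m[q2]) not in data.out[d]:
                return False
        for t, q2 in query.inc[q]:
            if q2 in m and (t, m[q2]) not in data.inc[d]:
                return False
        return True

    def extend(i, mapping):
        if i == len(order):
            results.append(dict(mapping))
            return
        q = order[i]
        for d in cands[q]:
            if d not in mapping.values() and consistent(q, d, mapping):
                mapping[q] = d
                extend(i + 1, mapping)
                del mapping[q]

    extend(0, {})
    return results


# Hypothetical sample data: three people and one company.
data = PropertyGraph()
for pid in ("p1", "p2", "p3"):
    data.add_node(pid, "Person", name=pid)
data.add_node("c1", "Company", name="Acme")
data.add_edge("p1", "KNOWS", "p2")
data.add_edge("p2", "KNOWS", "p3")
data.add_edge("p1", "WORKS_AT", "c1")
data.add_edge("p2", "WORKS_AT", "c1")

# Query: a person who knows a colleague at the same company.
query = PropertyGraph()
query.add_node("a", "Person")
query.add_node("b", "Person")
query.add_node("c", "Company")
query.add_edge("a", "KNOWS", "b")
query.add_edge("a", "WORKS_AT", "c")
query.add_edge("b", "WORKS_AT", "c")

result = match(query, data)
print(result)
```

In this sample the signature filter already discards p3 as a candidate for either person in the query, so the enumeration step never tries it. The abstract's point about edges influencing the candidate set concerns this kind of early pruning, although the thesis's actual index and pruning rules are more elaborate than this sketch.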
Keywords/Search Tags:Property graph, property graph schema, combination rules, schema optimization, subgraph matching