Pattern Match On The Large Graph Databases

Posted on:2011-07-13

Degree:Master

Type:Thesis

Country:China

Candidate:C X Jie

Full Text:PDF

GTID:2178360305992568

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

The data management problem nearly exists in all the fields of human activity. In the information age, it is very important for all the people to manage large amount of data resource by computer. Relational database has gained great success in many fields, but nowadays, the complexity of many kinds of data has been beyond the processing capacity of relational database. Graph data is one of such complex data types. It has very strong expressing ability and can express the semantics of many real things very well, such as social networks, XML documents, biochemistry molecules, protein-protein networks, and so on. So, query and management of graph data has attracted many researchers' attention.The problem of pattern matching in graph database requires that, given a query graph, all the appearance locations in the database should be found. It is one of the key problems in graph database and also one of the most useful problems. Some researchers and developers has provides several solutions to this problem. But, most of them either focus on the "small graph" databases, that is, supposing that some graphs in database can be completely loaded into main memory and the whole matching process can be also executed in memory; or mainly focus on the basic query in large graph database, for the pattern matching, they usually use join operation in relational database to implement it. This paper mainly focuses on the problem of how to execute pattern matching query efficiently in a large, disk-based graph database. The main contributions of this paper are listed below:(1) It provides a large graph database design and implements important parts in it.(2) It provides a novel disk-based pattern matching algorithm. In this paper, the problem of pattern matching is converted into the problem of joining many temporary tables. Pointing to the defects of "breadth first" and "depth first" strategy, this paper suggests a "full forward-empty-backward" strategy. This strategy can complete the matching procedure without writing out any intermediate results to the disk. Some disk-based graph specified problems, such as, the methods of storing and accessing large amount of temporary data, cache strategy, etc, are also deeply discussed in this paper.(3) Some optimization methods are provided for our algorithm framework. The effects of different joining processing orders to the query execution efficiency are discussed in this paper and a measuring method of the goodness of a joining processing order is provided. A method of selecting a better joining order is also provided in this paper. Parallel executing of our algorithm and some off-line optimization methods are discussed in this paper.The experiments on a real dataset show that our algorithm and optimization methods are very effective.

Keywords/Search Tags:

graph database, pattern matching, query optimization

PDF Full Text Request

Related items

1	Research On RDF Graph Query Optimization Method Based On Subgraph Matching
2	Research On The Optional Matching Issue In RDF Queries
3	Design And Implementation Of The Subsystem Of Regular Expression Matching On Graph Data Search Engine Trinity
4	Research On Data Query Processing And Optimization In Distributed Database
5	A Research Of Generation And Optimization Of Query Plan Based On Graph Database
6	Research On Graph Pattern Matching Method Based On Kinship Network
7	Based On Historical Query Relational Database Query Optimization Keyword Research Questions
8	Research And Implementation Of Database Query Optimization Based On Graph Neural Network
9	Hybrid Graph Query And Graph Computing Engine For Distributed Graph Database
10	Research On XML Query Pattern Matching And Document Filtering