
Design And Implementation Of The Subsystem Of Regular Expression Matching On Graph Data Search Engine Trinity

Posted on: 2016-12-30
Degree: Master
Type: Thesis
Country: China
Candidate: J B Han
GTID: 2308330503451185
Subject: Software engineering
Abstract/Summary:
Many applications over massive graph data need to retrieve pairs of vertices such that the path between them satisfies some constraint. Regular expressions are a powerful way to describe patterns over sequences, so this paper defines a special form of graph pattern matching: the regular expression query on large-scale graphs, in which a regular expression states the constraint that a path must satisfy. This research helps to find paths that meet a given constraint, provided that the constraint can be represented by a regular expression.

This paper is based on a real project at MSRA: the distributed graph engine Trinity. Trinity is a distributed, in-memory, large graph processing engine, underpinned by a strongly-typed RAM store and a general computation engine. The distributed RAM store provides a globally addressable high-performance key-value store over a cluster of machines. Through the RAM store, Trinity enables fast random data access over a large distributed data set.

The work covers regular expression processing, a study of graph generators, a query optimization algorithm, and the implementation of distributed algorithms. Regular expression processing is implemented based on an analysis of yacc; the processed form of the regular expression is suited to a distributed environment and can be matched easily. A graph generator is implemented based on a study of power-law graphs and is used to build massive synthetic data sets. A cost-based query optimization method is derived from a query optimization model and significantly improves system efficiency. Large-scale distributed algorithms for a cluster environment are implemented on top of the physical operators. Experimental data show that regular expression matching on large graphs meets the needs of real-time queries.
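The query type described in the abstract can be illustrated with a small single-machine sketch. This is not the thesis's distributed algorithm and does not use Trinity's API; it only shows the general idea of a regular expression query on an edge-labeled graph, using a breadth-first search over pairs of (vertex, automaton state). The function name `regular_path_query`, the edge labels `knows` and `worksAt`, and the hand-built automaton for the expression `knows+ worksAt` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical edge-labeled graph: vertex -> list of (label, target).
GRAPH = {
    "a": [("knows", "b")],
    "b": [("knows", "c")],
    "c": [("worksAt", "d")],
}

# Hand-built automaton for the expression `knows+ worksAt` (an assumption,
# not the thesis's yacc-based processing): state -> {label: next states}.
NFA = {
    0: {"knows": {1}},
    1: {"knows": {1}, "worksAt": {2}},
}


def regular_path_query(graph, nfa, start_state, accepting):
    """Return all (source, target) pairs joined by a path whose edge
    labels spell a word accepted by the automaton."""
    vertices = set(graph)
    for edges in graph.values():
        vertices.update(target for _, target in edges)

    result = set()
    for source in vertices:
        # Search the product of the graph and the automaton from `source`.
        seen = {(source, start_state)}
        queue = deque(seen)
        while queue:
            vertex, state = queue.popleft()
            if state in accepting:
                result.add((source, vertex))
            for label, target in graph.get(vertex, []):
                for next_state in nfa.get(state, {}).get(label, ()):
                    pair = (target, next_state)
                    if pair not in seen:
                        seen.add(pair)
                        queue.append(pair)
    return result


pairs = regular_path_query(GRAPH, NFA, start_state=0, accepting={2})
print(sorted(pairs))
```

Tracking visited (vertex, state) pairs keeps the search finite even when the graph or the expression contains cycles. A distributed engine would have to partition this search across machines, which the sketch does not attempt.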
Keywords/Search Tags: Graph Data, Graph Pattern Matching, Regular Expression, Distributed Algorithm, Query Optimization