
Research on a Generation Method for Low-Cost and Highly Parallel Query Plans

Posted on: 2021-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Kou
Full Text: PDF
GTID: 2518306104488144
Subject: Computer system architecture
Abstract/Summary:
With the growth of RDF (Resource Description Framework) data and the increasing variety of SPARQL (SPARQL Protocol and RDF Query Language) query processing scenarios, query plan generation faces great challenges. On the one hand, existing RDF query processing systems generate query plans inefficiently. On the other hand, because these systems do not estimate cost accurately when generating a query plan and do not take parallel query execution into account, the cost of query execution is high. These factors degrade the overall performance of the system. To address the problems of existing query plan generation methods, the RDF query processing system LHPP adopts a generation strategy based on a cost model to produce low-cost, highly parallel query plans and accelerate query execution. First, LHPP proposes a path-based cost model, uses a cost estimation algorithm based on weighted stream sampling to estimate the cost of join operations, and combines it with a statistics index to calculate the selectivity of patterns, join operations, and join variables. Then, LHPP applies different query plan generation strategies to queries with different join types. For star queries, LHPP orders the patterns according to their selectivity. For queries with multiple join variables, it generates several least-cost join orders according to selectivity and selects the one with the highest degree of parallelism to produce the final query plan. Finally, LHPP uses heuristic rules to optimize the query merging order on the simplified query graph. The experimental results show that, compared with the dynamic-programming-based query plan generation method in RDF-3X, the method used in LHPP is more efficient. Compared with RDF-3X and TripleBit, LHPP processes SPARQL queries faster. In addition, when the query plan used in LHPP is replaced with plans produced by BFS (Breadth First Search) and DFS (Depth First Search), the results show that LHPP's own query plan is more conducive to improving the overall performance of the system. Finally, a scalability test of the query plan shows that the plan used in LHPP contributes, to some extent, to improving the parallelism of query execution.
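The two plan-generation strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the LHPP implementation: the class names, the `k` parameter, the numeric parallelism score, and the convention that a lower selectivity value means a more selective pattern (so such patterns run first) are all assumptions made for illustration, and the sample patterns and costs are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    text: str
    # Assumption: estimated fraction of triples matched; lower = more selective.
    selectivity: float


@dataclass(frozen=True)
class JoinOrder:
    name: str
    cost: float        # estimated execution cost (hypothetical units)
    parallelism: int   # assumption: number of joins that can run concurrently


def order_star_query(patterns):
    """Order the patterns of a star query, most selective first."""
    return sorted(patterns, key=lambda p: p.selectivity)


def pick_join_order(candidates, k=2):
    """Keep the k cheapest join orders, then pick the most parallel of them."""
    cheapest = sorted(candidates, key=lambda c: c.cost)[:k]
    return max(cheapest, key=lambda c: c.parallelism)


# Invented example data.
star = [
    TriplePattern("?s rdf:type ub:Student", 0.30),
    TriplePattern("?s ub:name ?n", 0.90),
    TriplePattern("?s ub:advisor <Prof1>", 0.01),
]
ordered = order_star_query(star)

candidates = [
    JoinOrder("left-deep", 100.0, 1),
    JoinOrder("bushy", 110.0, 3),
    JoinOrder("zigzag", 500.0, 4),
]
chosen = pick_join_order(candidates, k=2)

print([p.text for p in ordered])
print(chosen.name)
```

In this sketch the "zigzag" order has the highest parallelism but is excluded because it is not among the `k` cheapest candidates, which mirrors the abstract's description of filtering by cost first and by parallelism second.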
Keywords/Search Tags: cost model, sample, query plan, selectivity, parallelism