Parallel Query Processing System On Large-scale RDF Data

Posted on:2015-11-13

Degree:Master

Type:Thesis

Country:China

Candidate:G Yang

Full Text:PDF

GTID:2308330452957200

Subject:Computer technology

Abstract/Summary:

The RDF (Resource Description Framework) data model was proposed for modeling Web objects as part of the development of the Semantic Web. It has been used in various applications, such as Wikipedia, government, and biological information. The size of RDF datasets is growing exponentially: they now exceed one billion triples and continue to grow significantly. This explosive growth poses serious challenges to existing approaches to RDF data analysis and processing. Designing an efficient RDF query engine has therefore become an urgent problem.

This thesis presents TripleParallel, a parallel query processing system for large-scale RDF data, which proposes efficient processing techniques for RDF data at the scale of one billion triples. The technique builds on the characteristics of RDF data and uses an RDF graph data structure for data abstraction. To speed up the processing of SPARQL queries, TripleParallel's parallel processing model is based on block granularity. For query plan generation, selectivity estimation is used to determine the degree of each variable and to select the binding patterns of the query graph. In the block-grained approach, the parallel processing model takes blocks as its unit and separates data extraction from data manipulation, with a pipeline connecting the two processes. This approach improves the degree of parallelism, increases the overlap between data access and computation, and reduces the overall execution time of a query. For processing within a block, TripleParallel presents a parallel join method, which is further optimized for different data manipulations to improve processing speed.

TripleParallel performs well in both block-grained and intra-block processing, reducing query time by 25% compared to TripleBit. On the one hand, TripleParallel reduces the time from planning to execution plan generation and makes the whole process more compact. On the other hand, it uses pipelined processing and accelerates both block-grained and intra-block processing to balance the load across processors and to improve the efficiency of concurrent execution for workloads of different sizes.
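
The block-grained pipeline described in the abstract, in which data extraction and data manipulation run as separate stages connected by a pipeline, can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy blocks, the function names (`extract`, `manipulate`, `run_pipeline`), and the use of a bounded thread-safe queue are all assumptions made for illustration only.

```python
import queue
import threading

# Hypothetical toy data: each "block" is a list of (subject, predicate, object) triples.
BLOCKS = [
    [("a", "knows", "b"), ("a", "type", "Person")],
    [("b", "knows", "c"), ("c", "type", "Person")],
    [("c", "knows", "a")],
]

_DONE = object()  # sentinel marking the end of the block stream


def extract(blocks, out_q):
    """Extraction stage: hand blocks to the pipeline one at a time."""
    for block in blocks:
        out_q.put(block)
    out_q.put(_DONE)


def manipulate(in_q, predicate, results):
    """Manipulation stage: filter each block as soon as it arrives."""
    while True:
        block = in_q.get()
        if block is _DONE:
            break
        results.extend(t for t in block if t[1] == predicate)


def run_pipeline(blocks, predicate):
    # A bounded queue lets extraction of the next block overlap with
    # manipulation of the current one (assumed design, not from the thesis).
    q = queue.Queue(maxsize=2)
    results = []
    producer = threading.Thread(target=extract, args=(blocks, q))
    consumer = threading.Thread(target=manipulate, args=(q, predicate, results))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


matches = run_pipeline(BLOCKS, "knows")
```

In this sketch the unit of work handed between stages is a whole block rather than a single triple, which is the sense in which the processing is block-grained; a real system would presumably run several manipulation workers and read blocks from storage.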

Keywords/Search Tags:

block level parallel processing, parallel processing model, load balancing, parallel join algorithms

Related items

1	Parallel Query Processing Techniques In Parallel Database System PBASE/2
2	Research On Parallel Processing Techniques For Vision-navigation Of Mobile Robot
3	Research On Key Problems Of Parallel Wire-speed Processing In Network Processors
4	Efficient Algorithms For Image Restoration And Parallel Processing
5	Research Of Parallel Computing In Polymorphic Array Processor
6	Study On Parallel Processing Technologies Of Photogrammetry Data Based On GPU
7	Parallel Algorithms And Parallel Implementation Of Meshless Numerical Simulation
8	A load-balancing tool for structured multi-block CFD applications applied to a parallel Newton-Krylov algorithm
9	Key Technology Research Of The Meteorological Monitoring Data Packets Parallel Processing
10	Packet Processing Engine Parallel Architecture Design And Implementation On Commodity Multi-core Processor