With the development of the Semantic Web and the Linked Open Data movement, the RDF data published on the Internet has already reached tens of billions of triples and continues to grow geometrically. Managing and querying this data effectively is becoming increasingly important. Traditional single-machine methods for evaluating SPARQL basic graph patterns cannot meet the requirements of data at this scale, and methods based on the MapReduce computation model do not fully exploit the potential of distributed computing.

To address this problem, we propose an algorithm based on the BSP model for evaluating SPARQL basic graph patterns. Drawing on the graph nature of RDF data and the definition of the SPARQL basic graph pattern, we divide the whole process into two phases: the “matching phase” and the “interactive phase”. Based on this algorithm, we design and implement a prototype SPARQL query system. The engine of the prototype is built on the HAMA computation framework, which is an implementation of the BSP model. In the data persistence layer, we use the Cassandra database to meet the fast-read requirements of RDF data. We also design a query cache component to improve query speed. Finally, we conduct an experiment that compares our BSP-based approach with the MapReduce-based method proposed by Myung J. The experimental results show that our method achieves better query time performance.

In summary, the proposed BSP-based method makes the most of a key feature of BSP, the ability to send messages between vertices. For the SPARQL basic graph pattern problem, our method fully exploits the potential of distributed computing and can therefore support SPARQL queries over large-scale RDF data.