Research And Implementation Of Graph Data Processing Technology Based On Cloud Computing Environment

Posted on: 2015-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: R B Zhang
GTID: 2298330467963458
Subject: Computer Science and Technology
Abstract/Summary:
With the arrival of the information era, information of all kinds is growing explosively, and processing graph data efficiently has become a new challenge. This thesis studies the key technologies of graph data processing (graph data storage, graph data partitioning, and parallel computing engines) and tries to provide a more reasonable and efficient solution. The main research work can be summarized as follows:

This thesis first analyzes a variety of distributed storage models and distributed computing engines. On this basis, it proposes a parallel architecture that uses HDFS as the storage layer and MapReduce and BSP as the computing engines, providing computing services to the application layer.

Graph data partitioning is the core technology of graph data storage. Because of the special structure of graph data, distributed computation on it incurs a large amount of communication. To reduce the communication cost while ensuring load balancing, this thesis studies partitioning techniques and implements two multi-level graph data partitioning algorithms based on the BSP model.

This thesis implements a variety of graph mining algorithms. Hama is an open-source implementation of the BSP model. The BSP model is well suited to algorithms with many iterations, especially graph mining algorithms. Based on Hama, four graph mining algorithms have been implemented: PageRank, single-source shortest path (SSSP), K-means, and LPA. Experiments show that the platform performs well with these four Hama-based parallel algorithms.

Combining these techniques, this thesis designs and implements a Social Network Analysis (SNA) system. The system's features include graph data extraction, graph algorithm (social network algorithm) analysis, and result query and display.
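To illustrate why the BSP model suits iterative graph algorithms, the following is a minimal single-process sketch of PageRank organized as BSP supersteps: each vertex computes locally, sends messages to its neighbors, and all vertices wait at a barrier before the next superstep. It is not the thesis's implementation and does not use the Hama API; the function name `bsp_pagerank`, the damping factor 0.85, the fixed superstep count, and the even redistribution of rank from vertices without out-edges are all assumptions made for illustration.

```python
from collections import defaultdict


def bsp_pagerank(edges, num_supersteps=30, damping=0.85):
    """Simulate vertex-centric PageRank as a sequence of BSP supersteps.

    `edges` is an iterable of (source, target) pairs. Everything runs in
    one process; in a real BSP engine the vertices would be partitioned
    across workers and the barrier would be a global synchronization.
    """
    out_neighbors = defaultdict(list)
    vertices = set()
    for src, dst in edges:
        out_neighbors[src].append(dst)
        vertices.update((src, dst))

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(num_supersteps):
        # Local computation + communication: every vertex splits its
        # current rank evenly among its out-neighbors as messages.
        inbox = defaultdict(float)
        dangling = 0.0  # rank held by vertices with no out-edges (assumed handling)
        for v in vertices:
            targets = out_neighbors[v]
            if targets:
                share = rank[v] / len(targets)
                for t in targets:
                    inbox[t] += share
            else:
                dangling += rank[v]

        # Barrier: only after all messages are delivered does each vertex
        # update its value for the next superstep.
        rank = {
            v: (1.0 - damping) / n + damping * (inbox[v] + dangling / n)
            for v in vertices
        }
    return rank


if __name__ == "__main__":
    sample_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
    for vertex, score in sorted(bsp_pagerank(sample_edges).items()):
        print(vertex, round(score, 4))
```

In this sketch the only cross-vertex traffic is the `inbox` messages, which is why a partitioning that keeps most edges inside one partition reduces communication between workers.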
Keywords/Search Tags: cloud computing, graph mining, parallel processing, graph data partition, BSP, Hama