Analysis And Research On MARS Framework Based On GPU Computing

Posted on: 2017-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Lv
Full Text: PDF
GTID: 2348330488450944
Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet, network resources contain more and more data and information in different forms, and the demand for big data processing is becoming increasingly urgent. Server CPU performance and I/O throughput are crucial for big data processing. However, with a traditional architecture and a serial processing model on a single computer, storage capacity, fault tolerance, and the speed of data access and processing fall far short of what big data processing requires.

Distributed and parallel data processing has proved to be an efficient way to deal with big data. Several existing parallel processing frameworks, such as Hadoop, Spark, and Disco, use the CPU to handle the data. However, a CPU has a limited number of cores and limited memory, so performance is restricted when the parallelism applied to very large data sets depends on the CPU core count. If the parallel computation can be moved to the GPU, which can launch a large number of threads, the processing speed for massive amounts of data can be effectively improved.

MARS is a GPU-based framework. In MARS, data is loaded into main memory in the form of key/value pairs. During processing, the Map and Reduce tasks are launched with a large number of GPU threads, and each thread is assigned a small and, as far as possible, equal number of key/value pairs so that the load is balanced across threads. In this way, the performance of processing large amounts of data can be optimized and the efficiency of big data processing improved.

This thesis focuses on the MARS framework. It mainly studies the framework's data processing functions, analyzes its design philosophy and implementation method, and analyzes in detail seven data processing applications implemented with MARS. The specific research work is as follows:

1. Introduce in detail several existing big data processing frameworks from China and abroad, such as Hadoop, Spark, and Disco, and compare their advantages and disadvantages;
2. Introduce in detail the architecture and platform of the MARS framework, including the GPU, CUDA, and MapReduce;
3. Analyze the design and implementation of the MARS framework, including its design goals, workflow, parameter configuration, interface design, and key technologies;
4. Analyze seven data processing applications implemented with MARS and present the experimental results: word frequency statistics, string matching, page view rank, page view count, matrix multiplication, inverted index, and similarity evaluation.
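
The abstract describes input held as key/value pairs, with each Map thread receiving a small and nearly equal share of them. The following is a minimal CPU-side sketch of that idea in Python, using word frequency statistics as the example. It is an illustration under stated assumptions, not the MARS API: the names `partition_evenly`, `map_word_count`, `reduce_sum`, and `run_map_reduce` are hypothetical, and the "threads" are simulated by a sequential loop rather than launched on a GPU.

```python
from collections import defaultdict


def partition_evenly(num_records, num_threads):
    """Return one (start, end) index range per simulated thread.

    Range sizes differ by at most one record, which mirrors the
    load-balancing goal stated in the abstract (hypothetical helper).
    """
    base, extra = divmod(num_records, num_threads)
    ranges = []
    start = 0
    for tid in range(num_threads):
        # The first `extra` threads take one additional record each.
        size = base + (1 if tid < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def map_word_count(key, value):
    """Hypothetical map function: emit (word, 1) for each word in a line."""
    return [(word, 1) for word in value.split()]


def reduce_sum(key, values):
    """Hypothetical reduce function: sum the counts emitted for one word."""
    return key, sum(values)


def run_map_reduce(records, num_threads):
    # Map stage: each simulated thread processes its own slice of records.
    intermediate = []
    for start, end in partition_evenly(len(records), num_threads):
        for key, value in records[start:end]:
            intermediate.extend(map_word_count(key, value))

    # Group stage: collect all values that share the same key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce stage: one reduce call per distinct key.
    return dict(reduce_sum(k, v) for k, v in groups.items())


# Input key/value pairs: (line number, line text).
records = [(0, "gpu map reduce"), (1, "map reduce on gpu"), (2, "gpu threads")]
ranges = partition_evenly(len(records), 2)
counts = run_map_reduce(records, 2)
```

On a real GPU the per-thread ranges would be computed from the thread index inside a kernel, and the grouping step would typically be a parallel sort or hash. The sequential loop here is only meant to make the even partitioning and the map, group, reduce order visible.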
Keywords/Search Tags: big data, parallel data processing, Map/Reduce, GPU computing