Research And Implementation Of Lightweight Parallel Computing Model Based On BSP

Posted on: 2018-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Li
GTID: 2348330518498974
Subject: Computer software and theory
Abstract/Summary:
With the arrival of the big data era, the widespread adoption of terminal devices has made it possible to collect and store many kinds of data. At the same time, huge volume, high dimensionality, complex data types and low value density have become the defining characteristics of big data [1-2]. Although big data brings new opportunities for data-driven analysis in the relevant industries, its volume and dimensionality also pose new challenges for computation. To address this problem, parallel computing frameworks were proposed and have spread rapidly. The mainstream parallel computing frameworks at present include Hadoop, Spark and Storm. Each of these frameworks has its own characteristics and application scenarios, but they share a common trait: they need a relatively large computing cluster to support them. Moreover, large-scale parallel computing frameworks require complex configuration and maintenance, and the single-node, single-thread method they adopt leaves the corresponding computing resources underused. A lightweight computing model that is easy to maintain and deploy and that makes full use of computing resources has therefore become an immediate need.

To compensate for the limitations of large parallel computing frameworks, this thesis presents a lightweight parallel computing model. Its main features are that it is easy to implement, easy to maintain and easy to deploy, and that it makes full use of computing resources. The underlying design is a two-level parallel architecture that combines multi-node parallelism with single-node parallelism. This design improves the utilization of cluster resources and overcomes the limited computing power of single-node parallelism alone. The management module of the model sits on top of the two-level parallel architecture and implements cluster management, task assignment, resource scheduling and result collection. To balance the load across the cluster, this thesis proposes a dynamic strategy for task assignment. To handle exceptions during parallel computation, it proposes a delay waiting strategy, which includes methods for handling task computation timeouts and task computation failures. Data management in the model uniformly applies a data block strategy: exploiting the local data access pattern of the algorithm, data is loaded and computed in batches, which makes it possible to process a large amount of data in a small amount of memory.

At the end of the thesis, three experiments are carried out on the parallel computing model, using a feature extraction algorithm as the example. First, functional tests are run by simulating various environments; they cover dynamic task assignment, faults of the main and auxiliary nodes, and delay waiting. Second, memory consumption under different conditions is measured by adjusting the data block size. Finally, the running time of the sample algorithm is measured on different data sets. By comparing and analyzing the results, the thesis evaluates how the parallel computing model performs in a specific scenario.
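
The data block strategy and the dynamic task assignment described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names (`iter_blocks`, `with_retries`, `process_in_blocks`), the thread pool standing in for the computing nodes, and the simple retry-on-failure policy are all assumptions made for illustration. In particular, the retry is only a stand-in for failure handling and does not reproduce the delay waiting strategy, and timeouts are not modeled. The data here is already in memory, so the sketch shows only the control flow of block-wise computation, not the memory savings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def iter_blocks(data: Sequence[float], block_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of at most block_size items (hypothetical helper)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for start in range(0, len(data), block_size):
        yield data[start:start + block_size]


def with_retries(compute: Callable[[Sequence[float]], float],
                 block: Sequence[float],
                 max_retries: int) -> float:
    """Run compute on one block, retrying if it raises.

    Assumed policy for illustration only; the abstract does not specify
    how failed tasks are handled.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return compute(block)
        except Exception:
            if attempt == max_retries:
                raise  # give up after the last allowed attempt
    raise AssertionError("unreachable")


def process_in_blocks(data: Sequence[float],
                      block_size: int,
                      compute: Callable[[Sequence[float]], float],
                      workers: int = 4,
                      max_retries: int = 2) -> List[float]:
    """Compute one partial result per block with a pool of workers.

    The executor keeps a shared queue of pending blocks. Each worker takes
    the next block as soon as it finishes its current one, so faster workers
    end up processing more blocks (a simple form of dynamic assignment).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(with_retries, compute, block, max_retries)
                   for block in iter_blocks(data, block_size)]
        # Collect results in block order, regardless of completion order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    values = list(range(10))
    partial_sums = process_in_blocks(values, block_size=4, compute=sum)
    print(partial_sums, sum(partial_sums))
```

In this sketch the block size controls how much data each task touches at once, which corresponds to the parameter varied in the memory consumption experiment. A real multi-node deployment would replace the thread pool with worker processes or remote nodes.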
Keywords/Search Tags: Lightweight, Parallel Computing, Two Level Parallel, Dynamic Allocation