The Task Slot Model Of Hama Graph Parallel Computing Framework And Its Influence On The Performance Of Job Scheduling

Posted on: 2017-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: H J Lu
GTID: 2308330485486472
Subject: Software engineering
Abstract/Summary:
Hadoop and its related technologies are now widely used in many fields and have become almost synonymous with Big Data. After several years of rapid development, Hadoop has become a de facto standard for enterprise big data computing, and MapReduce, its core computing framework, plays a significant role in massively parallel processing. Despite its good performance, Hadoop has shortcomings; in particular, it is poorly suited to graph-parallel computing. Hama makes up for this shortcoming: it supports parallel computing similar to Hadoop's and also offers features suited to graph computing.

However, Hama is still under development, many of its features are incomplete, and it cannot yet be put into practical use. At the same time, Hama is a graph-parallel computing framework that is open to secondary development, so a job scheduler can be designed to meet the needs of a particular application. Hadoop's two well-known job schedulers, the Fair Scheduler and the Capacity Scheduler, both grew out of actual production use. By drawing on the design of these two schedulers, users can design their own job scheduler for Hama graph computing that meets practical needs.

This thesis first briefly introduces the Hadoop platform and its ecosystem to explain how Hadoop processes data, covering mainly the distributed file system HDFS and the MapReduce parallel computing framework. It then focuses on the design ideas behind three popular existing Hadoop job scheduling algorithms, which serve as ideas and references for the algorithm designed here. Next, the thesis studies Hama starting from the BSP (Bulk Synchronous Parallel) computing model, with a focus on the principles of superstep computation.
Building on an understanding of the function and role of each node in Hama's architecture, the thesis analyzes the Hama source code to gain a deeper understanding of the job scheduling process and the job life cycle. Based on this analysis, and taking into account users' actual environments and their differing data processing demands, the thesis designs a task slot model and a priority job scheduling algorithm for Hama and describes the design in detail. Finally, the design is implemented and functionally tested, and its performance is compared with that of the original scheduler. The test results show that the proposed design not only makes up for the shortcomings of the existing scheduling algorithm, allowing multiple users to share cluster resources and making full use of those resources, but also performs better than the original design.
Keywords/Search Tags:Task-slot Model, Multi-level, Job Scheduling, Priority