
Research And Improvement Of Job Scheduling Algorithms In Hadoop Platform

Posted on: 2011-11-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xia
GTID: 2178360308963862
Subject: Computer application technology
Abstract/Summary:
Cloud computing has recently developed rapidly, driven by both industry and academia, and more and more cloud computing systems have been put into service. Most of these systems use the Hadoop platform to develop and debug programs. Hadoop is an open-source framework for very large-scale data processing applications. Its main advantage is that parallel processing is transparent to application developers: they can develop cloud computing applications the same way they develop ordinary applications, while the Hadoop platform takes care of the parallel processing. However, Hadoop is a young platform and still has many points to be improved.

Job scheduling is one of the Hadoop platform's key technologies. Its main function is to control the execution order of jobs and the allocation of computing resources, which directly affects the platform's overall performance and its use of system resources. At present, this technology is still at an early stage, and the existing job scheduling algorithms all have some defects. Studying the existing algorithms to find ways of overcoming their defects is therefore important for improving Hadoop's overall performance and resource utilization.

This thesis addresses the following issues. First, through an extensive literature survey, we studied the Hadoop platform's background, architecture, and core components, and then examined its job scheduling technology in depth. Second, we analyzed three existing algorithms, the FIFO scheduler, the Fair scheduler, and the Capacity scheduler, covering the background in which each arose and its underlying idea, design, implementation, and defects. On that basis, we proposed a Naive Bayes Classifier Based Job Scheduler and describe in detail its goal, algorithm flow, main components, key design points, and key implementation. Finally, we implemented this algorithm and ran experiments. The results showed that the algorithm achieves its goal of resolving the existing algorithms' defects while delivering excellent performance.
Keywords/Search Tags: Hadoop, Job Scheduling, Naive Bayes Classifier