Research And Implementation Of High Availability Cluster Job Management System

Posted on: 2015-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
GTID: 2308330464466572
Subject: Computer technology

Abstract/Summary:
With advances in technology and the growth of business scale, more and more companies rely on computers to process jobs. A single high-performance machine can no longer keep up with the growing workload, so clusters built from large numbers of ordinary servers have become a common solution. A cluster of many PCs connected by a high-speed network combines low cost with high performance, provides strong batch and parallel computing capability, and is widely used. However, because the individual nodes differ from one another, unified resource management, job scheduling, cluster monitoring, and load balancing have become the key problems to solve, which led to the emergence of cluster job management systems.

To build a cluster job management system that meets our needs, we first study the project background in depth. We then survey typical cluster job management systems and scheduling algorithms in China and abroad, and analyze the advantages and disadvantages of each algorithm. On this basis, the thesis focuses on Torque, an open source cluster management system: it improves Torque's job scheduling algorithm and builds a cluster job management system that supports both normal jobs and timed jobs. The main contributions are as follows.

First, the thesis applies object-oriented inheritance, using a "job template" and an "execution state" to handle jobs that are broadly similar but differ in some details.

Second, the system uses Haproxy + Keepalived to provide high availability and load balancing. To avoid single points of failure and cope with highly concurrent job requests, it combines Haproxy + Keepalived with N + 1 redundancy.

Third, the thesis improves Torque's job scheduling strategy to raise system performance. It analyzes Torque's important data structures and its job scheduling module, uses Torque to build the cluster management system, and presents a new algorithm that improves on Torque's job scheduling algorithm.

Fourth, the system supports timed jobs by borrowing the Linux cron expression syntax. Timed jobs are defined with cron expressions; the system monitors the time and submits each job to Torque when it becomes due, so it supports both general jobs and timed jobs.

Our work improves the efficiency of job scheduling and the utilization of cluster resources, and shortens user response time, so it has considerable practical value.
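The fourth point, defining timed jobs with cron expressions and handing them to Torque when they become due, can be illustrated with a minimal sketch. It is not the thesis implementation: the names `parse_field`, `cron_matches`, and `due_jobs` are hypothetical, only the five-field form with `*`, lists, ranges, and `*/n` steps is handled, and all five fields are combined with AND (standard cron instead ORs day-of-month and day-of-week when both are restricted). The actual handoff to Torque, presumably through its `qsub` command, is left out.

```python
from datetime import datetime

# Allowed (min, max) for each of the five cron fields:
# minute, hour, day of month, month, day of week (0 = Sunday).
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(field, lo, hi):
    """Expand one cron field ('*', '5', '1-5', '*/15', '1,3') into a set of ints."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if not (lo <= start <= end <= hi) or step < 1:
            raise ValueError(f"invalid cron field: {field!r}")
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expression, when):
    """Return True if the datetime 'when' satisfies the five-field expression."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five space-separated fields")
    allowed = [parse_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)]
    # Python numbers Monday as 0; cron numbers Sunday as 0, so shift by one.
    actual = [when.minute, when.hour, when.day, when.month,
              (when.weekday() + 1) % 7]
    # Simplification: every field must match (plain AND).
    return all(value in field_set for value, field_set in zip(actual, allowed))


def due_jobs(timed_jobs, when):
    """Return the names of timed jobs that should be submitted at 'when'."""
    return [name for name, expr in timed_jobs.items() if cron_matches(expr, when)]
```

A monitoring loop would call `due_jobs` once per minute with the current time and submit each returned job to the batch system; keeping the matching logic free of side effects makes it easy to test separately from the submission step.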
Keywords/Search Tags:Heterogeneous Cluster, Task Scheduling, Torque, High Availability