
The Research And Implementation Of Job Management System Based On Cluster Computing Technology

Posted on: 2003-09-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X C Tang
Full Text: PDF
GTID: 1118360092466160
Subject: Computer software and theory
Abstract/Summary:
There are two important reasons for building our Job Management System Based on Cluster Computing Technology (CJobCenter). Firstly, with the rapid development of personal computers (PCs) and workstations, as well as progress in network technology, network-based applications have become widely used in commerce and scientific research. These applications need not just a single workstation or PC but a "virtual supercomputer" consisting of many computers, software management systems, and scientific devices. The geographically distributed, heterogeneous computing resources scattered through such a system are connected by fast networks to serve those applications.

Secondly, many organizations own hundreds of powerful workstations connected by a local-area network (LAN) or a wide-area network (WAN). It is common practice in such organizations to allocate each workstation to a single user, the owner, who exercises full control over the workstation's resources. As a result, the throughput of some users is limited by the power of their own workstation or PC. Frustrated, they often complain that their jobs are waiting while workstations or PCs owned by more casual and more sporadic users are available (under-loaded). Their productivity could be significantly enhanced if they had access to this under-loaded computing capacity. CJobCenter is designed to solve this wait-while-under-load problem. To users, CJobCenter provides a single, transparent virtual computing environment. To system managers, it provides a high-throughput, high-performance computing environment.

Our research project started in 1998, with the goal of developing a practical Job Management System Based on Cluster Computing Technology, named CJobCenter. The system has two kinds of cluster: the workgroup cluster and the workload cluster. High availability is supported in the workgroup cluster.
The main work and achievements of the author since 1998 cover the following aspects.

1. Functionally, CJobCenter is composed of seven parts: the interrelated job part, the pre-scheduler, resource requirement, tradeoff, resource supply, workgroup-cluster resource management, and the Network Queue System. Each part has its own function, and the parts support one another.

2. CJobCenter lets users build interrelated jobs. One interrelated job can nest other interrelated jobs, with measures taken to prevent deadlock during execution. Job queues can be customized based on job priority, departmental policies, or project requirements. The pre-scheduler is used to fine-tune the queues for optimum throughput and to define the priority of users or groups of users.

3. The workgroup cluster model is proposed in CJobCenter. On the basis of this model, a flexible calendar scheduler and an event scheduler are used to submit interrelated jobs. A job distribution algorithm based on node load and queue length helps increase system capacity, avoiding the situation in which some nodes are overloaded while others are under-loaded.

4. Among workgroup clusters, the tradeoff method is used to provide transparent access to all users. All the distributed, heterogeneous hardware and software computing resources are presented as a seamless single system image. CJobCenter achieves high throughput in both LAN and WAN environments.

5. In commercial and scientific applications, system downtime results in very high losses. CJobCenter aims to minimize downtime by providing an architecture that keeps the system running in the event of a single system failure. When an execution host crashes, its jobs automatically restart on another active host. After the crashed host recovers, the migrated jobs return to it. High availability is achieved by using checkpoint technology to resume the lost jobs.

The research work of this dissertation was supported by an international corporation.
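
The abstract says that jobs are distributed according to node load and queue length, but it does not give the algorithm. The following is only a minimal sketch of one way such a policy could look. The `Node` class, the `pick_node` and `dispatch` functions, the equal weights, and the `max_queue` normalization are all assumptions made for illustration and are not taken from CJobCenter.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a cluster node (not CJobCenter's data model)."""
    name: str
    load: float          # CPU load, 0.0 (idle) to 1.0 (saturated)
    queue_length: int    # jobs already waiting on this node
    alive: bool = True   # False once the host has crashed


def pick_node(nodes, load_weight=0.5, queue_weight=0.5, max_queue=10):
    """Return the live node with the lowest combined score, or None.

    The score is an assumed weighted sum of CPU load and normalized
    queue length; lower means less busy.
    """
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        return None

    def score(n):
        queue_part = min(n.queue_length / max_queue, 1.0)
        return load_weight * n.load + queue_weight * queue_part

    return min(candidates, key=score)


def dispatch(jobs, nodes):
    """Assign each job to the currently least busy live node."""
    placement = {}
    for job in jobs:
        node = pick_node(nodes)
        if node is None:
            raise RuntimeError("no live node available")
        placement[job] = node.name
        node.queue_length += 1  # the new job now waits in that node's queue
    return placement


if __name__ == "__main__":
    cluster = [Node("a", 0.25, 0), Node("b", 0.10, 0), Node("c", 0.0, 0, alive=False)]
    print(dispatch(["j1", "j2", "j3"], cluster))
```

Because each placement lengthens the chosen node's queue, later jobs drift toward other nodes, which is the balancing effect the abstract describes. Crashed hosts are skipped, so the same selection could also pick the host on which a failed job restarts.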
Keywords/Search Tags: Cluster Technology, Job Management System, Network Queue System, Interrelated Job, Job Distribution, Tradeoff Model, High Throughput, High Availability