A Checkpoint/Restart Mechanism For MPI Application Migration In Grid Environments

Posted on: 2010-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: D X Li
GTID: 2178360272495899
Subject: Computer system architecture
Abstract/Summary:
Grid computing is a computing technology that emerged at the end of the last century. By combining computing resources that belong to different institutions and are located in different regions, grid computing can accomplish tasks that would otherwise need very powerful machines. A core idea of grid computing is to use powerful computing resources while they are idle. It provides computing power in the same way a power grid provides electricity: users only need to connect to the network instead of owning a generator. A grid contains a metascheduler and local schedulers, so a job request submitted from a client tool may be queued twice. Furthermore, a local scheduler may give jobs from its own local area a higher priority.

Globus Toolkit is a set of infrastructure components provided by the Globus Alliance for using grid resources. It contains everything needed to access heterogeneous, distributed computing resources, including security, resource management, and data transmission. Argonne National Laboratory implemented a message passing interface called MPICH-G2 that supports grid environments. It conforms to the MPI-1 specification and offers the same interface as non-grid MPI implementations, which makes it convenient for scientists to port their MPI programs to a grid environment.

At present, grid computing is attracting more and more attention from scientists because of its computing power. However, as the scale of computation increases, more nodes are required, so the probability that a single node failure disrupts a job also increases, and fault tolerance in parallel computing becomes more important. In addition, as mentioned above, the policy of the local scheduler makes job migration and load balancing more important.

Checkpointing is a key technology for load balancing and fault tolerance. It also supports job migration.
If checkpoints are taken at regular or irregular intervals, then when the system needs to restart after a fault or to migrate a job for load balancing, the system administrator can use the checkpoint file to roll back to the moment the job was checkpointed and avoid losing computation.

The main problems in checkpointing a serial process are how to save the execution state of the process and how to optimize checkpointing so as to minimize the checkpoint file's size and the time spent on it. In a grid environment there are further points to consider. The first is whether checkpointing a single subjob will cause a global wait. The second is how to ensure that no message is lost while a subjob is checkpointing and restarting. The third is how to choose the right time to checkpoint. There are several common types of checkpointing algorithm for parallel programs: synchronous, asynchronous, and quasi-synchronous. These have been well studied and have some valuable implementations, but they still need to be studied in the grid environment.

After analysing the basic problems of parallel job checkpointing and the specifics of the grid environment, we find that rebuilding the topology of the subjobs must be taken into account when checkpointing a subjob.

This thesis designs a checkpointing mechanism for grid MPI program migration. It is a sub-project of the MPI parallel process migration project. The mechanism includes a main algorithm named Q-SC/R (quasi-synchronous checkpointing/restart). We use the following algorithm to prevent message inconsistency before and after checkpointing:

1. The client controller sends a checkpointing signal to a subjob (or a group of subjobs), which makes it enter the checkpointing process.
2. The subjob receives the signal. It then broadcasts the number of messages it has sent to the others. This informs the other subjobs that it is about to enter the invalid state.
3. Each other valid subjob that receives that message waits for all messages still in transit. When all of them have arrived, it sends an acknowledgement message, attaching the number of messages it has sent to the other side, and then closes the communication channel used by this pair.
4. The interrupted subjob disconnects its channels to the other subjobs once all acknowledgement messages have arrived. It then changes to the invalid state and takes the checkpoint.
5. After the subjob has been checkpointed, the client controller resubmits it to a node for execution. The subjob restarts from the checkpoint file and rolls back to its previous state.
6. The subjob allocates a new listening port and sends a rebuild message to the client controller, carrying its new address and listening port.
7. The client controller gathers the address information of all subjobs that entered the checkpoint period and updates its job information table (JIT).
8. The client controller broadcasts its latest job information table.
9. Each subjob receives the client controller's message, updates its own channel table and the global node validity, and the system returns to normal status.

The quasi-synchronous checkpointing/restart algorithm we propose not only keeps messages consistent but also lets the other subjobs keep running instead of waiting globally. Furthermore, thanks to the design of the client controller, our system allows several subjobs to migrate in parallel at one time, and allows several subjob migrations to proceed concurrently without deadlock.

We carefully analyse the communication process of MPICH 1.2.6 and the layered architecture of MPICH-G2, then integrate the implementation of the algorithm, together with other necessary functions, into the MPICH-G2 architecture. We provide a new virtual device named ckptg2, which is based on the globus2 device and contains the checkpoint mechanism.
We named the new MPI implementation MPICH-ckpt, a checkpoint-capable message passing interface for grid environments. By modifying the MPICH 1.2.6 implementation, we do not weaken the functionality of MPICH-G2 but add a checkpointing and subjob migration mechanism.

This thesis designs a checkpoint/migration mechanism in which subjobs can still discover the whole job's topology, through special control, after a restart or migration. However, the latency of message transmission between domains may sometimes be very high, and the state announcement of restarted processes also needs to be optimized. Furthermore, because the availability of grid resources is uncertain, jobs may wait a long time, so we use a virtual job model to control this.

In future research, we will combine the CSF metascheduler with the virtual job model to optimize parallel job migration and to provide load balancing and fault tolerance for grid environments.
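
The message-counting idea behind steps 2 to 4 and the table rebuild in steps 6 to 9 of Q-SC/R can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the MPICH-ckpt implementation: the names Subjob, Grid, drain, checkpoint, and restart are hypothetical, the network is modelled as in-memory queues, and only one subjob is checkpointed at a time.

```python
from collections import deque


class Subjob:
    """Toy stand-in for one MPI subjob (hypothetical, not MPICH-ckpt code)."""

    def __init__(self, rank, address):
        self.rank = rank
        self.address = address
        self.valid = True
        self.sent = {}       # peer rank -> messages sent to that peer
        self.delivered = {}  # peer rank -> messages received from that peer
        self.channels = {}   # peer rank -> peer address, for open channels


class Grid:
    """Hypothetical client controller plus a simulated network."""

    def __init__(self, addresses):
        self.jobs = {r: Subjob(r, a) for r, a in addresses.items()}
        self.jit = dict(addresses)  # job information table: rank -> address
        self.in_flight = {}         # (src, dst) -> deque of undelivered messages
        for job in self.jobs.values():
            job.channels = {r: a for r, a in addresses.items() if r != job.rank}

    def send(self, src, dst, payload):
        self.in_flight.setdefault((src, dst), deque()).append(payload)
        sender = self.jobs[src]
        sender.sent[dst] = sender.sent.get(dst, 0) + 1

    def drain(self, src, dst, expected):
        """Deliver messages from src to dst until dst has seen `expected`."""
        receiver = self.jobs[dst]
        queue = self.in_flight.get((src, dst), deque())
        while receiver.delivered.get(src, 0) < expected:
            queue.popleft()  # raises IndexError if a message was lost
            receiver.delivered[src] = receiver.delivered.get(src, 0) + 1

    def checkpoint(self, rank):
        """Steps 2-4: quiesce every channel of `rank`, then snapshot it."""
        target = self.jobs[rank]
        for peer_rank in list(target.channels):
            peer = self.jobs[peer_rank]
            # Step 2: the target announces how many messages it sent to this peer.
            announced = target.sent.get(peer_rank, 0)
            # Step 3: the peer waits for them, acks with its own count, closes.
            self.drain(rank, peer_rank, announced)
            ack_count = peer.sent.get(rank, 0)
            del peer.channels[rank]
            # Step 4: the target waits for the peer's messages, then disconnects.
            self.drain(peer_rank, rank, ack_count)
            del target.channels[peer_rank]
        target.valid = False
        return {"rank": rank, "sent": dict(target.sent),
                "delivered": dict(target.delivered)}

    def restart(self, snapshot, new_address):
        """Steps 5-9: restore the subjob, then rebuild the topology."""
        job = self.jobs[snapshot["rank"]]
        job.sent = dict(snapshot["sent"])
        job.delivered = dict(snapshot["delivered"])
        job.address = new_address
        self.jit[job.rank] = new_address    # step 7: controller updates its JIT
        for other in self.jobs.values():    # steps 8-9: broadcast, update tables
            other.channels = {r: a for r, a in self.jit.items() if r != other.rank}
        job.valid = True


# Usage: subjob 0 is checkpointed while messages are in transit, then migrated.
grid = Grid({0: "hostA:5000", 1: "hostB:5000", 2: "hostC:5000"})
grid.send(0, 1, "x")
grid.send(1, 0, "y")
grid.send(1, 0, "z")
snapshot = grid.checkpoint(0)
grid.restart(snapshot, "hostD:6000")
```

In this sketch a channel is closed only after both sides have received exactly as many messages as the other side reports having sent, which is how the counts exchanged in steps 2 and 3 keep any message from being lost across the checkpoint. Subjobs 1 and 2 never touch their shared channel, mirroring the claim that uninvolved subjobs keep running.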
Keywords/Search Tags: Checkpoint, parallel job migration, grid parallel environment