The Performance Optimization To Grid Jobs Based On GSwap

Posted on: 2012-12-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Lin
Full Text: PDF
GTID: 1118330368478866
Subject: Computer system architecture
Abstract/Summary:
Grid computing enables the aggregation of distributed resources to carry out large-scale scientific calculations and industrial simulations. As precision requirements rise, the size of the data involved in a job grows rapidly, and large-scale computation is shifting from compute-intensive to data-intensive. In a distributed system, the data access rate is constrained by the network and the storage devices and is usually far below the processing rate of the CPU, so data access becomes the performance bottleneck of the system.

The bottleneck in grid job execution appears in two ways. (1) Computing sites wait idle while data is transferred during job submission. In the default job submission scheme, file transfer and job execution run in batch mode, which lowers the efficiency of job execution and the utilization of system resources. (2) Intermediate results are not available while jobs are running. In many grid applications, parameters could be adjusted according to intermediate results without waiting for the jobs to complete. However, the default job submission scheme only lets the user check the final results, which wastes time for time-consuming jobs. This paper therefore proposes GSwap to optimize the performance of grid jobs, and implements a prototype of it.

The monitoring system is an important low-level infrastructure in GSwap. It captures the current status of the system and provides information for system maintenance and further scheduling. In this paper, the monitoring system is designed and implemented according to the features of the grid. It is compatible with GSwap and comprises a resource monitor, a job monitor, and a file monitor. The resource monitor collects current information about the resources at each site, as well as the network bandwidth and latency between sites.
The job monitor collects the status of each job and properties such as the job ID, as well as information about the job's processes on the computing sites. The file monitor watches file accesses, counts them, and records the visitor of each access. The monitoring information is pushed to a database and published by MDS. All of the raw information is stored in the database, while only the important or statistical information is published by MDS. The database records can be used for grid performance analysis, whereas the information published by MDS is a summary of the data that can be used for dynamic decision-making in the grid. Benchmarks show that the monitoring system is low-cost, scalable, real-time, and robust.

GSwap provides both download and share modes for data access. Popular distributed file systems are compared, and their applicability to the grid is analyzed. As a result, NAS is chosen as the basic file-sharing protocol in GSwap. RLS is then used to manage the replicas, aggregating the NAS servers into GSwap. RLS is extended in this paper to meet the requirements of GSwap: file checksums and replica access monitoring are implemented. By combining RLS with the file monitoring system, the physical files in NAS are associated with the logical files in RLS. Thus, GSwap can respond automatically to file operations on the NAS file server. The design of the GSwap prototype takes the cost of file replica management into account. When information is updated from LRC to RLI, a Bloom filter is used to reduce the network cost and to avoid unnecessary file updates inside the local node, which improves the efficiency of file replica management.

To provide high performance and availability, the file replica strategy is studied. OptorSim is used to simulate the grid environment and evaluate the replica strategies.
This paper extends the interfaces of OptorSim to support the cascading and fast spread strategies. Within the structure of OptorSim, the access granularity is changed from file to block by dividing each file into several 'block files', in order to simulate NAS file-share access. After analyzing the features and requirements of GSwap, a new replica strategy, LTS, is proposed for GSwap file sharing on top of Cascading. LTS is simulated in OptorSim under file-share mode. Compared with Cascading, LTS achieves nearly equivalent performance at lower network and storage cost.

Finally, LTS and Cascading are both integrated into GSwap, and the system is tested with a series of jobs. The performance of job execution with no replica is measured as a baseline. The results are as follows: in No Replica mode, job execution with GSwap performs better than with GridFTP; in replica mode, job execution with LTS performs better than with Cascading under the simulated grid environment. These results demonstrate the performance gains of both GSwap and LTS.

The research in this paper can be applied not only to data exchange for grid jobs but also to a grid-based distributed file system that provides cheap storage services. In future research, security and a replica synchronization strategy should be considered in order to support write operations in GSwap. Job rescheduling can also be implemented by means of GSwap, using the intermediate results stored in it. To improve performance further, caching and buffering should be applied in GSwap. GSwap can also be used as a data platform in IoT, serving as the data management platform for Savant.
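
The abstract mentions that a Bloom filter reduces the network cost of updates from LRC to RLI. The idea can be illustrated with a minimal Python sketch: instead of sending every logical file name, a catalog sends a fixed-size bit array that summarizes them. Everything in the sketch is an assumption made for illustration and is not taken from the dissertation: the class and function names (`BloomFilter`, `summarize_catalog`, `needs_update`), the filter size, the number of hashes, the use of SHA-256, and the example file names. The `needs_update` check is only one possible reading of "avoid the unnecessary file update".

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter over strings (illustrative, not an RLS format)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def summarize_catalog(logical_names, num_bits=1024, num_hashes=4):
    """Build the compact summary a local catalog would send (hypothetical helper)."""
    summary = BloomFilter(num_bits, num_hashes)
    for name in logical_names:
        summary.add(name)
    return summary


def needs_update(previous, current):
    """Assumed policy: skip the update when the summary bits are unchanged."""
    return previous is None or previous.bits != current.bits


if __name__ == "__main__":
    # Hypothetical logical file names held by one local catalog.
    names = ["lfn://run1/a.dat", "lfn://run1/b.dat"]
    summary = summarize_catalog(names)
    print(len(summary.bits), "bytes sent instead of the full name list")
    print("lfn://run1/a.dat" in summary)
```

The size of the summary depends only on `num_bits`, not on how many names the catalog holds, which is where the network saving comes from. The trade-off is that a membership test can return a false positive, so an index built this way can only say that a site possibly holds a file, and the site's own catalog must confirm it.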
Keywords/Search Tags: Grid Computing, Job Performance Optimization, Replica Strategy, Distributed File System, Grid Monitoring