Grid File Replication Strategy Based On Distributed File Sharing

Posted on: 2012-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: P Jiang
Full Text: PDF
GTID: 2218330332499929
Subject: Network and information security
Abstract/Summary:
Grid computing is a new technical infrastructure whose purpose is to integrate the Internet into a single giant supercomputer and thereby achieve resource sharing across the Internet. A grid provides transparent and efficient computing capability, unified information, and application services. Data in a data grid is movable, replicable, and cacheable. In some cases, however, a single storage device cannot hold a complete data set, so a huge data set must be split into several pieces that are stored on devices at multiple nodes. To improve access speed, files need to be transferred to a grid site closer to the user. Moreover, to make the storage system reliable, the system also needs to back up the data. This gives rise to the concept of data replicas (copies), which requires a reasonable arrangement of how replicas are created and where they are stored.

This paper first presents and analyzes three popular basic replica creation strategies: the best customer model, the waterfall model, and the rapid proliferation model. These three strategies consider only file access counts and do not take the load on the nodes into account, so this paper proposes a best customer and balancing load strategy that draws on the experience of the three. The strategy is an improvement of the waterfall model; it balances the load on the nodes and reduces job waiting time, and is therefore expected to increase the efficiency of grid job execution.

The file access patterns of a grid system can be divided into file download mode and file sharing mode. In recent years, researchers have focused on improving optimization strategies that are based on file download mode.
In file download mode, all required files are first transferred to the local node, and the file operations are then performed sequentially. File sharing mode, by contrast, accesses the data where it resides on the network through direct remote access using a file sharing protocol. Compared with file download mode, file sharing mode has two advantages. First, it improves the efficiency of job execution by running data transmission and job execution in parallel, which reduces the time a user waits for job data to be transmitted. Second, job execution results can also be written to the storage nodes in a shared manner, so the user can view intermediate results and modify job execution parameters in time. Therefore, building on the proposed data replication strategy, this paper studies data grid optimization from a new perspective: data replication strategy under file sharing mode. In file sharing mode, jobs can access the required files while they are still being transmitted, which could improve the overall performance of the grid.

Researching and improving replica creation strategies requires experiments and comparisons in a grid environment, but building a large-scale grid is a huge and very complex undertaking. Grid simulators address this problem. A grid simulator can simulate a wide variety of grid environments, and users can obtain the desired environment by adjusting its control parameters. The replica strategies can therefore be studied in this virtual grid environment. In this paper, OptorSim is used as the grid simulator.
Before using it, we study and analyze the simulator in depth, exploring its architecture, functions, characteristics, and application areas, so that we can simulate the grid environment we need through appropriate adjustments and changes.

This paper implements the best customer model, the waterfall model, the rapid proliferation model, and the best customer and balancing load strategy by adapting the OptorSim simulator. Because OptorSim's default file access mode is file download mode, the file transmission time must be recalculated to realize file sharing mode. File sharing access is simulated by recalculating the job running time through an extension of the calculation cell. OptorSim is also modified so that the replica creation strategies can be implemented on top of file sharing mode and tested in the simulated grid environment.

This paper compares different replica optimization strategies in the same mode, as well as the same replica optimization strategy in different modes. The experimental data lead to the following conclusions:
a) In both file download mode and file sharing mode, the performance ranking from high to low is: BestClientLoadBalance model > waterfall model > rapid proliferation (fast diffusion) model > best customer model.
b) For the same replica creation strategy, performance in file sharing mode is superior to that in file download mode.

Based on these conclusions, this paper proposes a best customer load balancing model in file sharing mode that considers both file access counts and node load conditions. It also runs data transmission and job execution in parallel, so as to reduce job data transmission time while improving the grid's performance...
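The difference between the two access modes can be illustrated with a small timing sketch. This is not the recalculation used in the modified OptorSim; it is a simplified model under stated assumptions: files are transferred one after another, each file has a fixed transfer time and a fixed processing time, and in sharing mode a file can be processed as soon as it has arrived and the previous file has been processed. The function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def download_mode_time(transfer: Sequence[float], compute: Sequence[float]) -> float:
    """Job time when every file is transferred first, then processed sequentially."""
    return sum(transfer) + sum(compute)


def sharing_mode_time(transfer: Sequence[float], compute: Sequence[float]) -> float:
    """Job time when transfer and processing overlap (assumed file-level pipeline)."""
    arrived = 0.0  # time at which the current file has fully arrived
    done = 0.0     # time at which processing of the previous file finished
    for t, c in zip(transfer, compute):
        arrived += t                   # transfers run back to back
        done = max(done, arrived) + c  # wait for the file, then process it
    return done


# Hypothetical workload: three files, times in seconds.
transfer_times = [4, 2, 3]
compute_times = [1, 5, 2]
print(download_mode_time(transfer_times, compute_times))  # 17
print(sharing_mode_time(transfer_times, compute_times))   # 13.0
```

In this model the sharing-mode time can never exceed the download-mode time, because processing only ever waits for data that download mode would also have had to wait for. The size of the gain depends on how much processing can be hidden behind transfers.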
Keywords/Search Tags: Data Grid, OptorSim Network Simulation, Data Replication Strategy, File Access Pattern, Sharing Pattern