
Study On Algorithm Of Parallel I/O Scheduling On Distributed Computing

Posted on: 2006-02-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B Q Ceng
Full Text: PDF
GTID: 1118360182968615
Subject: Computer application technology
Abstract/Summary:
Distributed computing systems are increasingly becoming an important form of high-performance computing, and the theory and technology of parallel I/O on such systems is an important research topic. This dissertation studies new strategies for allocating and partitioning data files and for scheduling parallel I/O accesses, and proposes new algorithms. The strategies include a declustering strategy for large-scale data files and a grouping strategy for large numbers of small data files. The scalability of data storage resources is also studied. New algorithms are put forward for data access scheduling and for the partitioned storage and lookup of multidimensional data.

When a large-scale data file is partitioned into many data blocks, a new strategy named Knowledge Known File Declustering and Access (KKFDA) is proposed to improve the hit rate on local disks and reduce data access delay. It guarantees that files are accessed in a way consistent with how they were distributed and stored. With KKFDA, the system achieves a higher hit rate on local disks, lower network traffic, and faster response time to the application.

For file grouping, in which a large number of small files are placed in a distributed heterogeneous computing environment, two new file assignment strategies are studied for the case where the files are grouped directly and each file is assigned to a disk as a whole. One is named available percent decision-making (APD), and the other is the combination of subsection select and available percent decision-making (CSSAPD). They improve the efficiency of storing files on disks and meet the demands of scalability and high availability.

In existing parallel I/O research, some researchers concentrate mainly on minimizing disk utilization by balancing the system load across all disks, and largely neglect minimizing the variance of the service time.
In addition to balancing the load and reducing the utilization of every disk, a new parallel I/O file assignment strategy for cluster computing systems is proposed. It is named the Heuristic File Sorted Assignment algorithm (HFSA). On top of load balancing, it assigns files with similar service times to the same disk, so that the service times of the files on any one disk are equal or close. The new strategy noticeably improves performance.

To improve access rate and efficiency in a distributed computing environment, two novel access scheduling strategies are put forward. One is the Adaptive Equi-Partition Scheduling Policy (AEQUI). The other is the Dynamic I/O Scheduling Algorithm of Two Times Scheduling and Self-maintenance Load Balancing (DIOTSMB). When balancing the load of parallel I/O accesses, AEQUI considers not only the equal partition of pending I/O requests among the I/O servers, but also the I/O requests that have already been assigned to the servers and not yet processed. This makes AEQUI an efficient new method. DIOTSMB is based on a receiver-driven strategy for distributed computing systems and effectively reduces the complexity of data access scheduling. It lightens the load on the I/O data scheduler and eases the bottleneck there, which shortens access scheduling time.

Retrieval over all the data elements of a large-scale multidimensional data set is very inefficient, and the dissertation studies this problem. An effective way to improve performance is to introduce parallel I/O technology and distribute the multidimensional data across multiple disks.
In our research, a new cyclic declustering strategy for multidimensional data is proposed, named Heuristic Strategy of Prime number between step value Hi and finding length M (HSPHM). It targets range-based data retrieval and extends the existing cyclic strategy from two-dimensional to multidimensional data. HSPHM is superior in both degree of parallelism and robustness.
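
As a rough illustration of cyclic declustering extended to several dimensions, the following Python sketch maps each grid cell to a disk by a weighted sum of its coordinates modulo the number of disks. It is not the HSPHM algorithm itself, whose details the abstract does not give. The function names, the reading of M as the number of disks, and the choice of step values coprime with M are all assumptions made for this example.

```python
from collections import Counter
from itertools import product
from math import ceil, gcd


def cyclic_disk(coords, steps, num_disks):
    """Map a grid cell to a disk: (sum of step_k * coord_k) mod num_disks."""
    return sum(h * c for h, c in zip(steps, coords)) % num_disks


def coprime_steps(num_disks, dims):
    """Pick the first `dims` positive integers coprime with num_disks.

    Illustrative choice only; the dissertation's heuristic for choosing
    the step values Hi is not described in the abstract.
    """
    steps = []
    h = 1
    while len(steps) < dims:
        if gcd(h, num_disks) == 1:
            steps.append(h)
        h += 1
    return steps


def range_query_cost(lo, hi, steps, num_disks):
    """Return (max blocks on any one disk, ideal) for the inclusive box [lo, hi].

    The ideal is ceil(total blocks / num_disks); a query is served with
    full parallelism when the two values are equal.
    """
    ranges = [range(a, b + 1) for a, b in zip(lo, hi)]
    load = Counter(
        cyclic_disk(cell, steps, num_disks) for cell in product(*ranges)
    )
    total = sum(load.values())
    return max(load.values()), ceil(total / num_disks)


if __name__ == "__main__":
    disks = 5
    steps = coprime_steps(disks, dims=3)
    worst, ideal = range_query_cost((0, 0, 0), (4, 0, 0), steps, disks)
    print(steps, worst, ideal)
```

Comparing the two values returned by `range_query_cost` over many query boxes gives a simple way to judge how close a given set of step values comes to the ideal degree of parallelism for range queries.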
Keywords/Search Tags: distributed computing, parallel I/O, data store, data access, scheduling strategy