Research On Grid Resources' Clustering Methods

Posted on: 2009-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: S L Jin
Full Text: PDF
GTID: 2178360242480842
Subject: Computer system architecture
Abstract/Summary:
The Grid, a new computing infrastructure, will fundamentally change how people think about computing. Grid computing connects geographically scattered, heterogeneous, high-performance computing media, storage, data, and other special resources over the Internet, in order to obtain an optimized combination of resources dynamically and to carry out high-performance collaborative computation.
As Grid technology moves toward standardization, integration, and large scale, more and more Virtual Organizations and their resources join the Grid. This increases the utility of the available resources but makes allocating and scheduling Grid resources more challenging. So far, many publications on Grid resource allocation have relied on preprocessing. Clustering methods, used in that preprocessing step, divide the whole set of resources into small subsets so that resources can be selected and scheduled more efficiently.
However, the recent literature does not study the preprocessing methods in depth. It either merely points out that preprocessing is necessary or assumes that some method is valid before allocating and scheduling the preprocessed resources; one example applies a dynamic clustering algorithm based on a fuzzy equivalence relation. That algorithm uses cut sets at different levels α to meet different users' requests, but it does not update the clustering results when the resource information is updated.
Research on Grid resource clustering methods starts from an investigation of the resources' characteristics: geographically scattered location, logical sharing, dynamic change, and heterogeneity. This research considers the latter two. By definition the Grid is always changing, which means that Grid resources and their performance also change dynamically; this is why the Grid is called dynamic. The heterogeneity of Grid resources refers to their varied structures, functions, and so forth.
Generally speaking, clustering methods can be classified into four categories: Partitioning Methods (PM), Hierarchical Methods (HM), Density-Based Methods (DBM), and Grid-Based Methods (GBM). The major PM algorithms are k-means and k-medoids, whose main advantage is that they are easy to compute. However, heuristic algorithms of this kind optimize a global partitioning criterion (in practice reaching only a local optimum), and they are not suited to large data sets or to clusters of arbitrary shape. Most HM algorithms are irreversible: once a group of objects has been merged or split, the next step operates on the resulting clusters and the earlier step cannot be undone. This category therefore cannot adapt to the dynamic location of Grid resources or to dynamic updates of their performance. DBM represents similarity by the distance between two points, defines a cluster as a maximal set of density-connected points, can find clusters of arbitrary shape, and handles noise well. Algorithms of this kind can therefore be applied to clustering heterogeneous Grid resources. DBSCAN is the typical DBM algorithm. GBM is known for its clustering speed, which is what clustering varied, heterogeneous, and dynamic Grid resources requires. STING is the typical GBM algorithm.
Although both DBM and GBM can be applied to Grid resource clustering, an algorithm that integrates the advantages of both will be better suited to the task.
CLONE, a combined density-based and grid-based algorithm, defines two core concepts: the neighboring cell and the i-th-order neighbors. Unlike DBM, it does not have to compute the distance between two points; it only compares the corresponding performance parameters of two points to determine their similarity. CLONE defines the neighboring cell to express that parameter values falling within the same range, represented by an interval, are equivalent.
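As a rough illustration of the neighboring-cell idea, the sketch below maps each performance parameter of a resource to the index of the interval it falls in and treats two values as equivalent when their indices match. The fixed interval widths, the parameter names, and the helper names (`cell_index`, `equivalent_params`) are assumptions made for this example only; the abstract does not say how CLONE chooses its intervals.

```python
from math import floor

def cell_index(value, width):
    """Index k of the interval [k*width, (k+1)*width) that contains value."""
    return floor(value / width)

def equivalent_params(a, b, widths):
    """Names of the parameters whose values fall in the same interval for a and b."""
    return [name for name in widths
            if cell_index(a[name], widths[name]) == cell_index(b[name], widths[name])]

# Hypothetical resources and interval widths (not taken from the thesis)
widths = {"cpu_mhz": 500, "mem_mb": 1024, "disk_gb": 100}
r1 = {"cpu_mhz": 2400, "mem_mb": 2048, "disk_gb": 250}
r2 = {"cpu_mhz": 2100, "mem_mb": 3000, "disk_gb": 120}

print(equivalent_params(r1, r2, widths))  # parameters on which r1 and r2 share a cell
```

No distance is computed here: similarity comes only from comparing interval indices parameter by parameter, which is the property the abstract attributes to CLONE.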
The algorithm also defines the i-th-order neighbors, where i is set by the number of equivalent performance parameters when resources are clustered under a given rule. Because CLONE places no requirements on the objects being clustered and executes efficiently, it can be applied to clustering large numbers of heterogeneous Grid resources. However, CLONE cannot handle dynamic clustering of Grid resources properly. Modifying CLONE to account for this dynamic behavior yields the DCLONE algorithm. DCLONE inherits the two core concepts of CLONE, the neighboring cell and the i-th-order neighbors. It processes the dynamic information of Grid resources to produce a sparse matrix, mdfarray, and then modifies the clustering result matrix according to mdfarray. DCLONE is thus better suited to dynamic clustering of Grid resources, with the same time complexity as CLONE.
Two comparative experiments and two performance experiments are designed to demonstrate how CLONE and DCLONE perform when clustering Grid resources. The comparative experiments compare CLONE with two other typical algorithms, measuring clustering time as the number of resources and the number of performance parameters increase. The results show that CLONE is suitable for clustering heterogeneous Grid resources. The performance experiments test the stability of DCLONE and conclude that DCLONE is suitable for clustering Grid resources dynamically.
To apply DCLONE in the Grid, this paper designs a DCLONE Grid resource clustering system based on the Grid information service system MDS. The system applies the DCLONE algorithm and implements two modules: a conversion module and a clustering module.
The conversion module converts resource information into the matrix input file that the clustering module needs, then converts the clustering result matrix output by the clustering module back into resource information and returns it to MDS. The main function of the clustering module is to run the DCLONE clustering algorithm to obtain the resource clustering results.
Future work is to improve and modify the existing prototype system. For example, if predicted resources could be clustered, allocating and dispatching resources would be more efficient. In addition, the clustering system should be expanded by adding a scheduling module to the DCLONE Grid resource clustering system, so that efficient allocation and scheduling can build on the clustering preprocessing step. The research makes a bold attempt: it proposes, for the first time, applying a dynamic clustering algorithm, DCLONE, to Grid resource clustering. As grid technology and clustering methods continue to develop and improve, research on Grid resource clustering methods will advance further.
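To give a concrete picture of the i-th-order neighbors and of a sparse update in the spirit of DCLONE's mdfarray, here is a minimal sketch. It rests on several assumptions that the abstract does not confirm: each resource is a tuple of interval (cell) indices, the order i of two resources is the count of parameters with matching indices, and `mdfarray` is modeled as a dictionary `{(resource, parameter): new_cell}`. The function names are hypothetical, and the real DCLONE result matrix may be organized differently.

```python
def neighbor_order(cells_a, cells_b):
    """Number of parameters whose cell indices match (assumed meaning of i)."""
    return sum(1 for x, y in zip(cells_a, cells_b) if x == y)

def order_matrix(cells):
    """Pairwise neighbor orders for a list of per-resource cell-index tuples."""
    n = len(cells)
    return [[neighbor_order(cells[r], cells[s]) for s in range(n)] for r in range(n)]

def apply_updates(cells, orders, mdfarray):
    """Apply sparse updates {(resource, param): new_cell}; refresh only touched rows."""
    touched = set()
    for (r, p), new_cell in mdfarray.items():
        row = list(cells[r])
        row[p] = new_cell
        cells[r] = tuple(row)
        touched.add(r)
    for r in touched:
        for s in range(len(cells)):
            # Keep the matrix symmetric while recomputing only affected entries
            orders[r][s] = orders[s][r] = neighbor_order(cells[r], cells[s])
    return orders

# Hypothetical cell indices for three resources with three parameters each
cells = [(4, 2, 2), (4, 2, 1), (1, 0, 1)]
orders = order_matrix(cells)

# Resource 2 reports a change in parameter 0; only its row and column are recomputed
apply_updates(cells, orders, {(2, 0): 4})
print(orders)
```

The point of the sketch is the update pattern: a change in one resource touches one row and one column of the pairwise matrix rather than forcing a full re-clustering, which is consistent with the abstract's claim that DCLONE keeps the time complexity of CLONE.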
Keywords/Search Tags:Resources'