
Grid Resources' Selection Based On Application-Preference Based Fuzzy Clustering

Posted on: 2010-03-25  Degree: Doctor  Type: Dissertation
Country: China  Candidate: D Guo  Full Text: PDF
GTID: 1118360272996144  Subject: Computer system architecture
Abstract/Summary:
With the development of grid technology, grid applications have been developed in many areas. Because they differ in objectives, targets of operation, and internal structure, these applications usually have special requirements for resource performance in addition to general ones. For example, computing-intensive applications require strong floating-point and fixed-point capability; data-intensive applications require strong I/O capability; and applications whose computation can be decomposed and evenly distributed require strong communication capability between grid nodes. Meanwhile, grid resources are geographically distributed, heterogeneous, and owned by different individuals or organizations with their own policies and access models, and their loads and availability vary dynamically. These features make grid resource selection more difficult.

To reduce the time and space complexity of resource allocation, the available resources can be classified into clusters with data mining techniques guided by application preference, and a suitable cluster can then be selected to support application execution. Clustering is a major data mining discipline aimed at finding homogeneous groups of data called clusters. Cluster analysis divides data into groups such that similar data objects belong to the same cluster and dissimilar objects belong to different clusters. In real applications there is often no sharp boundary between clusters, so fuzzy clustering is often better suited to the data. Fuzzy clustering uses membership degrees between zero and one instead of crisp assignments of data to clusters.

This paper classifies the available resources into clusters with different performance according to application preference and then selects suitable resources to support application execution.
Application preference is the application's requirement for resource performance and is defined as a vector with m elements, one element per performance requirement. It can be obtained through code analysis, previous execution experience, and tracing of the application process. Application preference can be defined and redefined easily for different grids and applications. Resource performance data that match the application requirements can be computed with formulas introduced in the previous literature. Fuzzy clustering guided by application preference is then applied to create resource clusters. A resource selection algorithm is designed to evaluate and select resources. The algorithm first evaluates the performance of every resource cluster according to application preference and sorts the clusters in descending order. Resources are then selected randomly from the first cluster. If that cluster does not contain enough resources, further resources are selected from the second cluster, and so on until enough resources have been selected. This approach has been applied in a grid by designing a resource selection framework on top of the grid information service.

To reduce the cost of resource selection, a simple and direct algorithm for computing the transitive closure matrix is introduced into the fuzzy clustering step. Because the resource performance matrix created by our approach is a fuzzy similar matrix, which is symmetric, the algorithm has been improved by replacing half of the computing operations with assignment operations, which reduces the computational cost. The performance of resource allocation is thereby further enhanced.

Besides meeting application requirements, classifying resources with this approach reduces the cost of choosing resources. Furthermore, by defining the application preference vector, grid resources can be divided into clusters according to different criteria.
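The clustering and selection steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names, the threshold `lam`, the sample data, and the mean weighted-sum cluster score are all hypothetical. The closure is computed by the textbook repeated max-min squaring rather than the improved algorithm the abstract mentions, although it does use the symmetry of the matrix to compute only the upper triangle and assign the mirrored entry.

```python
import random

def maxmin_square(r):
    """Max-min composition of a symmetric fuzzy matrix with itself."""
    n = len(r)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            v = max(min(r[i][k], r[k][j]) for k in range(n))
            out[i][j] = v
            out[j][i] = v  # symmetric entry: assigned, not recomputed
    return out

def transitive_closure(r):
    """Square the matrix repeatedly until it stops changing."""
    while True:
        nxt = maxmin_square(r)
        if nxt == r:
            return r
        r = nxt

def lambda_cut_clusters(t, lam):
    """Group indices whose closure similarity is at least lam (assumed threshold)."""
    seen, clusters = set(), []
    for i in range(len(t)):
        if i in seen:
            continue
        group = [j for j in range(len(t)) if t[i][j] >= lam]
        seen.update(group)
        clusters.append(group)
    return clusters

def select_resources(clusters, perf, preference, count, rng=random):
    """Rank clusters by mean preference-weighted performance (an assumed
    scoring rule), then draw randomly from the best cluster downward."""
    def score(cluster):
        total = sum(sum(w * p for w, p in zip(preference, perf[i]))
                    for i in cluster)
        return total / len(cluster)

    chosen = []
    for cluster in sorted(clusters, key=score, reverse=True):
        need = count - len(chosen)
        if need <= 0:
            break
        chosen.extend(rng.sample(cluster, min(need, len(cluster))))
    return chosen

# Hypothetical data: 4 resources, 2 performance dimensions (CPU, I/O).
similarity = [[1.0, 0.8, 0.2, 0.3],
              [0.8, 1.0, 0.4, 0.1],
              [0.2, 0.4, 1.0, 0.9],
              [0.3, 0.1, 0.9, 1.0]]
perf = [[0.9, 0.2], [0.8, 0.3], [0.3, 0.9], [0.2, 0.8]]
preference = [0.7, 0.3]  # a CPU-heavy application preference vector

closure = transitive_closure(similarity)
clusters = lambda_cut_clusters(closure, 0.8)
print(select_resources(clusters, perf, preference, count=3))
```

With a CPU-heavy preference vector, the cluster of CPU-strong resources ranks first, so the first two picks come from it and the third falls through to the next cluster.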
Although clusters created under different application preferences may share some resources, the load is not concentrated on only a few of the best resources, so load balance can be improved.

Classifying resources according to application preference requires detecting and predicting the resource performance that matches application requirements. Focusing on CPU performance, this paper proposes a method for detecting CPU performance and, building on it, a method for predicting it. Tools that use these methods have been implemented, and their performance data have been used in the fuzzy clustering of resources.

To detect CPU performance, a typical benchmark program is run on every grid node; it returns a metric named WMFLOPS that can be used to compare the performance of different CPUs directly. An algorithm that dynamically chooses a moderate computation cost is designed to ensure that every test is comprehensive while the added overhead stays comparatively small. The measurement data are continually updated and distributed through MDS so that resource allocation and scheduling decisions can be made at run time based on deliverable performance. A distributed CPU performance tool named GcpSensor is designed and implemented to achieve this goal. Experiments show that this method reflects CPU performance sensitively and that users can use the GcpSensor data to compare the performance of different CPUs in a straightforward way, which increases the accuracy and efficiency of selecting computing resources.

In particular, resource allocation and scheduling decisions must be based on predictions of the performance each resource will be able to deliver to an application during a specified time frame. It is therefore necessary to predict CPU performance on the basis of the detected CPU performance. A CPU performance forecasting tool named GcpForecaster is designed for this purpose.
It collects periodic performance measurements of CPUs and generates accurate forecasts with comparatively low space and time complexity by using a dynamic exponential smoothing algorithm. The forecasts can be used to compare the future performance of different CPUs directly and are made available to users and schedulers at run time so that they can select computing resources accurately and efficiently.

Finally, the concepts of this paper are summarized and future work is proposed.
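The abstract does not specify how the dynamic exponential smoothing in GcpForecaster works, so the following is only one plausible reading, sketched under assumptions: several candidate smoothing factors are tracked in parallel, and the forecast comes from whichever has the lowest accumulated squared error so far. The class name `AdaptiveSmoother`, the candidate factors, and the error criterion are all hypothetical. State is constant-size per candidate, so no measurement history is stored.

```python
class AdaptiveSmoother:
    """Exponential smoothing with the smoothing factor chosen dynamically.

    Assumed scheme: keep one running forecast per candidate alpha and
    report the forecast of the candidate with the smallest cumulative
    squared one-step-ahead error.
    """

    def __init__(self, alphas=(0.1, 0.3, 0.5, 0.7, 0.9)):
        self.alphas = alphas
        self.forecasts = None               # one running forecast per alpha
        self.errors = [0.0] * len(alphas)   # cumulative squared errors

    def update(self, measurement):
        """Feed one periodic measurement (e.g. a WMFLOPS reading)."""
        if self.forecasts is None:
            # First measurement seeds every candidate forecast.
            self.forecasts = [measurement] * len(self.alphas)
            return
        for i, a in enumerate(self.alphas):
            # Score the previous forecast before updating it.
            self.errors[i] += (measurement - self.forecasts[i]) ** 2
            self.forecasts[i] = a * measurement + (1 - a) * self.forecasts[i]

    def forecast(self):
        """Return the forecast of the best-performing candidate so far."""
        if self.forecasts is None:
            raise ValueError("no measurements yet")
        best = min(range(len(self.alphas)), key=lambda i: self.errors[i])
        return self.forecasts[best]


# Hypothetical readings: performance jumps from 1.0 to 10.0.
smoother = AdaptiveSmoother()
for reading in [1.0, 1.0, 1.0, 10.0, 10.0, 10.0]:
    smoother.update(reading)
print(smoother.forecast())
```

After a sudden level change, the larger smoothing factors accumulate less error, so the forecast follows the new level quickly; on a steady series all candidates agree.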
Keywords/Search Tags: grid, resource management, resource selection, fuzzy clustering, application preference, CPU performance