Research On Key Technologies Of Job Allocation And Scheduling In Data Mining Grid

Posted on: 2009-06-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X G Zhao
GTID: 1118360245469475
Subject: Computer Science and Technology
Abstract/Summary:
A Data Mining Grid (DMG) is a specialized Grid that integrates Grid techniques with data mining techniques. It is used to solve data mining problems that involve massive data and high computing requirements. DMG has its own characteristics because of the complexity of data mining applications and the dynamic, heterogeneous nature of the Grid. Many of its key techniques need further investigation, one of which is Job Allocation and Scheduling. Job Allocation and Scheduling identifies resource requirements, finds and allocates resources, and schedules and monitors jobs. It aims to make full use of Grid resources and improve the efficiency of the whole Grid system. The process consists of three steps: resource discovery, resource allocation and job scheduling. Focusing on the key techniques of Job Allocation and Scheduling in DMG, this dissertation studies in depth the scheduling framework, job modeling, information service, resource allocation, pricing mechanisms and job scheduling.

The main contributions of this dissertation are as follows:

(1) A job model based on a Petri net with changeable structure is proposed.
A formal description of the Job Allocation and Scheduling process helps to validate, analyze and optimize that process. According to the requirements and characteristics of Job Allocation and Scheduling in a market-oriented DMG, this dissertation proposes a job model based on a Petri net with changeable structure and a corresponding scheduling model based on a hierarchical colored Petri net. The two models define the Job Allocation and Scheduling process strictly and provide methods to validate, analyze and optimize it. The liveness and reachability of the scheduling model and the job model are validated by analyzing their reachability trees. A transition tree algorithm is also presented to analyze the cost and time properties of a job net, which can be used to optimize resource allocation.

(2) An information service based on a Group of Interest mechanism is proposed.
Resource discovery is necessary for Job Allocation and Scheduling. This dissertation proposes a novel information service based on the Group of Interest (GOI) mechanism, which adopts information backup and neighbor classification mechanisms to provide reliable and efficient resource discovery for Job Allocation and Scheduling. Experimental results show that it suits large-scale, reliability-oriented Grids.

(3) A resource allocation mechanism based on cost-performance ratio is proposed.
Resource allocation selects the proper resources for tasks and assigns each of them a proper workload. It is an important part of Job Allocation and Scheduling. This dissertation proposes a resource allocation mechanism based on cost-performance ratio for commercial environments, which satisfies the requirements of jobs and gives resource users maximal cost-efficiency. The analysis shows that the mechanism achieves maximal cost-efficiency for resource users while optimizing the cost and time of tasks at the same time.

(4) A dynamic price mechanism based on demand prediction and task classification is proposed.
In the process of Job Allocation and Scheduling, besides considering the benefits of resource users, it is necessary to ensure the benefits of resource providers and to balance resource loads. This dissertation proposes a dynamic price mechanism based on demand prediction and task classification, which adjusts the prices of resources and tasks in advance according to predicted resource loads and the Demand Price Elasticity Ratio. Experimental results show that the mechanism ensures the benefits of resource providers and promotes load balancing at the same time.
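
The reachability analysis mentioned in contribution (1) can be illustrated with a minimal sketch. It is not the dissertation's changeable-structure or hierarchical colored Petri net. It is an ordinary place/transition net with arc weight 1, and the place and transition names are hypothetical, chosen only to mirror the three steps named in the abstract (resource discovery, resource allocation, job scheduling). The sketch enumerates reachable markings breadth-first, which terminates only for bounded nets.

```python
from collections import deque

# Hypothetical toy net: names are illustrative, not taken from the dissertation.
# Each transition maps to (input places, output places); every arc has weight 1.
TRANSITIONS = {
    "discover_resources": ({"job_submitted"}, {"resources_found"}),
    "allocate_resources": ({"resources_found"}, {"resources_allocated"}),
    "schedule_job": ({"resources_allocated"}, {"job_scheduled"}),
}


def is_enabled(marking, inputs):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking.get(place, 0) >= 1 for place in inputs)


def fire(marking, inputs, outputs):
    """Return the marking reached by consuming input tokens and producing output tokens."""
    new_marking = dict(marking)
    for place in inputs:
        new_marking[place] -= 1
    for place in outputs:
        new_marking[place] = new_marking.get(place, 0) + 1
    # Drop empty places so that equal markings compare equal.
    return {place: n for place, n in new_marking.items() if n > 0}


def reachable_markings(initial, transitions):
    """Breadth-first enumeration of the reachability graph of a bounded net."""
    start = frozenset((p, n) for p, n in initial.items() if n > 0)
    seen = {start}
    edges = []  # (source marking, transition name, target marking)
    queue = deque([start])
    while queue:
        current = queue.popleft()
        marking = dict(current)
        for name, (inputs, outputs) in transitions.items():
            if is_enabled(marking, inputs):
                target = frozenset(fire(marking, inputs, outputs).items())
                edges.append((current, name, target))
                if target not in seen:
                    seen.add(target)
                    queue.append(target)
    return seen, edges


markings, edges = reachable_markings({"job_submitted": 1}, TRANSITIONS)
print(len(markings), "reachable markings,", len(edges), "firings")
```

In this toy net the marking with a single token in `job_scheduled` is reachable from the initial marking, which is the kind of property a reachability tree is used to check. Liveness analysis and the cost and time analysis of the transition tree algorithm are outside the scope of this sketch.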
Keywords/Search Tags: Data Mining Grid, Job Allocation and Scheduling, Job Modeling, Information Service, Dynamic Price, Cost-performance Ratio