Cloud Computing Method Of Reliability Evaluation And Task Scheduling Research

Posted on: 2013-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: F Tan
Full Text: PDF
GTID: 2248330374985900
Subject: Mechanical Manufacturing and Automation
Abstract/Summary:
In the era of Cloud computing, low service reliability causes users to experience frequent service failures, which undermines their satisfaction with the Cloud service provider. Such users may switch to another provider, and the original provider may consequently lose substantial market share. As with traditional industrial products, service reliability is therefore a core factor in market competition. On the other hand, because Cloud computing is characterized by the variety of services it provides, Quality of Service (QoS) is also critical for a commercial Cloud. The service reliability of Cloud computing is thus defined as the probability that a user's request is completed within the specified time, which ties it closely to system performance. Meanwhile, fault tolerance is a common way to improve the reliability of computer systems. Because adopting fault tolerance negatively affects the performance of the whole system, approaches for modeling and evaluating service reliability in this setting are rarely addressed in existing research. We therefore study in detail the performance evaluation of Cloud services with fault recovery taken into account. We consider recovery on both processing nodes and communication links, as well as the precedence constraints among subtasks. The proposed performance evaluation models and methods yield more realistic results and are therefore of practical value for decision-making in Cloud computing.

Job scheduling is another critical issue in Cloud computing research. An effective job scheduling algorithm dispatches jobs to the most suitable resources in the resource pool, so that QoS constraints such as reliability, cost, and service time are satisfied while the load remains balanced across all resources in the Cloud. At present, most research on job scheduling assumes that the computing resources in the Cloud are perfectly reliable. Existing job scheduling algorithms therefore ignore hardware/software failure and recovery in the Cloud and the resulting uncertainty in system performance: a computing resource that fails during job execution incurs overhead to recover, which delays job completion and causes economic loss to the Cloud service provider. Consequently, current job scheduling algorithms cannot achieve the economic optimum.

On this basis, we introduce failure and recovery scenarios for the computing entities in the Cloud and propose an algorithm based on a Markov decision process to handle the uncertainty introduced by failures of computing resources and thereby maximize the long-term economic gain of the Cloud system.
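To make the idea of failure-aware scheduling with a Markov decision process concrete, the following is a minimal sketch and not the algorithm of the thesis, whose model is not given in this abstract. It assumes a toy Cloud with one unreliable node and one always-available backup resource. The state names ("up", "down"), the action names ("use_node", "use_backup", "wait"), and all probabilities and rewards are hypothetical values chosen for illustration. Plain value iteration is used to find the policy that maximizes the discounted long-term gain.

```python
from typing import Dict, List, Tuple

# mdp[state][action] = list of (probability, next_state, reward).
# All names and numbers here are illustrative assumptions.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def build_toy_mdp(
    p_fail: float = 0.2,        # chance the node fails while running a job
    p_recover: float = 0.5,     # chance a failed node recovers in one step
    gain_node: float = 10.0,    # profit when the node completes a job
    gain_backup: float = 4.0,   # smaller profit from the costlier backup
    failure_loss: float = 30.0, # economic loss when a job is hit by a failure
) -> Transitions:
    return {
        "up": {
            "use_node": [
                (1.0 - p_fail, "up", gain_node),
                (p_fail, "down", -failure_loss),
            ],
            # Assumption: an idle node does not fail.
            "use_backup": [(1.0, "up", gain_backup)],
        },
        "down": {
            "wait": [
                (p_recover, "up", 0.0),
                (1.0 - p_recover, "down", 0.0),
            ],
            "use_backup": [
                (p_recover, "up", gain_backup),
                (1.0 - p_recover, "down", gain_backup),
            ],
        },
    }


def q_value(mdp: Transitions, values: Dict[str, float],
            state: str, action: str, gamma: float) -> float:
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in mdp[state][action])


def value_iteration(mdp: Transitions, gamma: float = 0.9,
                    tol: float = 1e-9, max_iter: int = 10_000):
    """Return (state values, greedy policy) for the discounted MDP."""
    values = {s: 0.0 for s in mdp}
    for _ in range(max_iter):
        new_values = {
            s: max(q_value(mdp, values, s, a, gamma) for a in mdp[s])
            for s in mdp
        }
        delta = max(abs(new_values[s] - values[s]) for s in mdp)
        values = new_values
        if delta < tol:
            break
    policy = {
        s: max(mdp[s], key=lambda a: q_value(mdp, values, s, a, gamma))
        for s in mdp
    }
    return values, policy


if __name__ == "__main__":
    for p_fail in (0.05, 0.2):
        values, policy = value_iteration(build_toy_mdp(p_fail=p_fail))
        print(p_fail, policy, values)
```

In this toy setting the chosen action in the "up" state depends on the assumed failure probability: with a small failure probability the node is worth the risk, while with a larger one the expected loss from failure and recovery outweighs its higher profit. A scheduler that assumes perfectly reliable resources would always pick the node.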
Keywords/Search Tags: Cloud computing, Service reliability, Job scheduling, Reinforcement learning, Markov decision process