Font Size: a A A

Research On Fault-Tolerance Divisible Load Scheduling Models And Algorithms

Posted on:2022-09-10Degree:MasterType:Thesis
Country:ChinaCandidate:H WangFull Text:PDF
GTID:2518306605467064Subject:Master of Engineering
Abstract/Summary:PDF Full Text Request
Distributed system is the main platform for processing large-scale tasks,which can efficiently handle big data tasks and computationally intensive tasks,and can effectively shorten the time of task completion through parallel computing.With the continuous expansion of the network scale of distributed computing systems,the number of processors participating in task computation has increased greatly.Even if each processor has good robustness,when the number of processors in the distributed computing system increases to more than thousands,the mean time to failure of the processor will also be reduced from hundreds of days to a few hours or less.When the processor involved in the computation fails due to internal errors,network attacks,network paralysis,etc.,the processor can no longer participate in task computation,and the results obtained by the previous installments will also be lost.The task of the processor need to re-allocated to other normally working processors,which will result in the task can't being completed on time,and will also result in wasted computing resources.Therefore,the task scheduling strategy of large-scale distributed systems must not only pursue the efficiency of task processing,but also ensure the effectiveness of task completion when the processor failure is unavoidable.In view of this,this thesis has deeply studied the divisible load scheduling model and its algorithm with fault-tolerance mechanism.The main research results of this thesis are as follows:1.Research on the fault-tolerance divisible load scheduling model and algorithm.Firstly,a fault-tolerance divisible load scheduling model is established based on a distributed computing system.The scheduling process of the model is mainly divided into three parts:the first installment scheduling,internal installment scheduling and the last installment scheduling.Secondly,taking the shortest time to complete the task as the goal,the optimal task allocation plan for each processor in the scheduling process is derived through formulas.Thirdly,in order to obtain the optimal number of installments of the model,we simplify the model to the homogeneous distributed computing system,and obtain the closed-form solution of the optimal number of installments under through theorem proofs and strict formula derivation;based on this solution,we can calculate the upper and lower bounds of the number of installments in the heterogeneous distributed computing system,and then design a heuristic algorithm to solve the optimal number of installments.Finally,several sets of simulation experiments are designed to verify the effectiveness of the proposed model and the effectiveness of the algorithm.The experimental results show that the model proposed in this thesis can ensure the shortest time of task completion regardless of whether there is a processor failure.At the same time,the resource utilization rate of the processor in the distributed computing system is improved.2.Research on the scheduling sequence of the fault-tolerance divisible load scheduling on heterogeneous distributed computing system.For the scheduling model proposed in Chapter Three,in order to further optimize the time of task completion of the model,we next research on the influence of the scheduling sequence of the processors in the model on the time of task completion.Existing studies have adopted the descending order of transmission speed as the scheduling order of the processor,because this order has been proven to be the optimal scheduling order of the single-installment scheduling model.However,we demonstrates through an example that this sequence is not only not the optimal scheduling sequence for the multi-installment scheduling model,but also causes time conflicts in the processor during task execution,which will cause the task to fail to complete the computation in the expected time.In view of this,on the basis of the previous model and algorithm,we propose a multiinstallment scheduling model for divisible loads that considers the scheduling sequence of processors,and design an efficient evolutionary algorithm to solve the optimal scheduling sequence of processors.Finally,through experiments and comparisons with commonly used scheduling sequences in existing studies,the experimental results show that the processor scheduling sequence obtained in this thesis can not only ensure that there is no time conflict in the task scheduling process,but also the time of the task completion is the shortest.
Keywords/Search Tags:Divisible-load Scheduling, Fault Tolerance, Scheduling Sequence, Genetic Algorithm
PDF Full Text Request
Related items