
Research On Trace-Driven Performance Interference Prediction Of Batch Tasks In Cloud Datacenters

Posted on: 2022-08-25 | Degree: Master | Type: Thesis
Country: China | Candidate: Y D Liang
GTID: 2518306764494684 | Subject: Enterprise Economy
Abstract/Summary:
Large-scale datacenters are the engines of cloud computing, and batch jobs are one major type of datacenter workload. Generally, a batch job is composed of multiple batch tasks that can be executed in parallel and have similar resource consumption patterns. As the deployment mode of batch jobs shifts from standalone to co-allocation, resource contention among batch tasks co-allocated on a physical server leads to severe performance interference among batch workloads and hence to degraded QoS in cloud datacenters. As datacenters grow in scale and workload diversity, quantifying batch task performance interference by enumerating all task combinations becomes infeasible, because the evaluation cost grows exponentially. To address this issue, this thesis proposes a trace-based performance interference prediction method. The method first extracts interference-oriented task features from a large-scale datacenter trace and then uses machine learning to establish a performance interference prediction model for batch tasks in co-execution environments. The main contributions of this thesis are as follows.

(1) Quantitative analysis of the performance interference of batch tasks using large-scale datacenter traces. The analysis verifies that performance interference can be explored via trace data and quantifies the impact of such interference on the execution efficiency of batch tasks. In addition, it shows that the performance interference experienced by a batch task can be predicted from information about its co-allocated task set.

(2) An interference-oriented batch task classification method. The method first conducts a statistical analysis of the raw attributes recorded in the datacenter trace. Second, it extracts from those raw attributes the features that are related to the resource consumption characteristics and computing behaviors of batch tasks. Third, all tasks recorded in the trace are classified by their execution times. Fourth, for each task group, the bisecting k-means clustering technique is adopted to further group tasks on the selected features. Finally, similar task groups are merged to reduce the complexity of interference modelling.

(3) A trace-driven performance interference prediction model for batch tasks. First, a feature representation model is proposed to describe the pre-deployed and post-deployed co-allocated task sets. Based on this representation model, a temporal CNN-based prediction model is established. To deal with the co-allocation pattern of batch tasks, the established model adopts relatively large convolution kernels and simple model layers. The loss function is designed to mitigate the impact of data skew in the sample trace data. Overall, the established model can predict the performance interference of an individual batch task while its co-allocated tasks vary during its execution.

(4) Dedicated performance evaluations. Both the proposed batch task classification method and the interference prediction model are evaluated in this thesis. Compared with the baseline methods, the proposed task classification method reduces the dispersion of performance interference among tasks in a single group by an average of 42.8%. Compared with three existing typical task interference prediction models, the proposed performance interference prediction model improves the F1-score by an average of 31.72%.
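
The classification steps in contribution (2) can be illustrated with a minimal sketch: tasks are first bucketed by execution time, and each bucket is then split with bisecting k-means on the selected features. This is not the thesis implementation. The function names (`classify_tasks`, `bisecting_kmeans`), the `time_edges` thresholds, and the two-dimensional feature vectors are hypothetical. The deterministic farthest-point initialisation is an assumption chosen for reproducibility. The final merging of similar groups is omitted.

```python
import bisect


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*points)]


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points, iters=20):
    """2-means (Lloyd's algorithm) used as the bisection step.

    Assumption: centers start at the point farthest from the centroid and
    the point farthest from that one, so the result is deterministic.
    """
    c = centroid(points)
    first = max(points, key=lambda p: sq_dist(p, c))
    second = max(points, key=lambda p: sq_dist(p, first))
    centers = [first, second]
    left, right = [], []
    for _ in range(iters):
        left = [p for p in points
                if sq_dist(p, centers[0]) <= sq_dist(p, centers[1])]
        right = [p for p in points
                 if sq_dist(p, centers[0]) > sq_dist(p, centers[1])]
        if not left or not right:
            break  # degenerate split, e.g. all points identical
        new_centers = [centroid(left), centroid(right)]
        if new_centers == centers:
            break  # assignments have converged
        centers = new_centers
    return left, right


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        idx = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        worst = clusters[idx]
        if len(worst) < 2:
            break  # nothing left to split
        left, right = two_means(worst)
        if not left or not right:
            break  # the worst cluster cannot be split further
        clusters[idx:idx + 1] = [left, right]
    return clusters


def classify_tasks(tasks, time_edges, k_per_bucket):
    """Bucket (exec_time, features) pairs by execution time, then cluster each bucket."""
    buckets = {}
    for exec_time, feats in tasks:
        b = bisect.bisect_right(time_edges, exec_time)
        buckets.setdefault(b, []).append(feats)
    return {b: bisecting_kmeans(pts, k_per_bucket) for b, pts in buckets.items()}


# Hypothetical tasks: (execution time in seconds, (cpu_share, mem_share)).
demo_tasks = [
    (5, (0.10, 0.10)), (8, (0.12, 0.11)), (12, (0.11, 0.09)),
    (20, (0.90, 0.88)), (30, (0.88, 0.91)), (45, (0.91, 0.90)),
    (300, (0.20, 0.80)), (400, (0.22, 0.78)),
    (500, (0.80, 0.20)), (600, (0.78, 0.22)),
]
groups = classify_tasks(demo_tasks, time_edges=[60], k_per_bucket=2)
```

Here `groups` maps each execution-time bucket to its list of feature clusters. In practice the bucket boundaries and the number of clusters per bucket would have to be chosen from the trace statistics, which the abstract does not specify.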
Keywords/Search Tags: cloud datacenter trace, batch task, performance interference, task classification