Since containerization was first proposed, container cloud technology has developed rapidly and been widely adopted. Kubernetes, with its automation and self-healing features, has become the de facto standard for cloud-native computing. However, because it is designed for generality, native Kubernetes still has the following shortcomings in real production scenarios. First, task types are distinguished only at a coarse granularity, so its ability to describe diverse workloads is insufficient for production use. Second, the set of managed resource types is limited, and the balanced use of resources such as GPUs and network is not considered. Third, the scheduling strategy is one-dimensional: Kubernetes scores new tasks according to the remaining CPU and memory resources of the node, which leads to contention for other resources and degraded performance. Fourth, the resource allocation and scheduling scheme based on Request and Limit leaves reserved resources idle when a task's actual usage is low.

To address these problems, this thesis carries out the following work on Kubernetes version 1.24.0. For the first problem, based on an analysis of the native workload description capabilities and of a production cluster dataset, this thesis designs multiple fine-grained task types using CRD and Operator to strengthen the cloud platform's ability to describe different types of tasks. For the second and third problems, with the support of these fine-grained task descriptions, the BMRA (Balanced Multi-Resource Allocation) scheme is proposed; it brings GPU sharing and network dataset read latency under management, comprehensively improving the balanced use of cluster resources across dimensions. For the fourth problem, and for the inability of static scheduling schemes to perceive fluctuations in utilization, the MTTC (Multi-Task Tide Colocation) scheme is proposed, building on the multiple task types and cluster schemes; it combines cluster resource configuration with LSTM-based usage prediction.

The task types, the BMRA scheduling strategy, and the MTTC scheduling strategy proposed in this thesis are tested and simulated on a Minikube cluster with simulated nodes. The experimental results show that the task types designed for the platform enhance Kubernetes' ability to describe and support multiple kinds of tasks, and that the BMRA scheduling strategy comprehensively improves the balance of cluster resource usage across dimensions. The MTTC strategy enables the colocation of online and offline tasks, effectively improving cluster utilization while preserving the performance of online services.
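
To make the idea of balance across resource dimensions concrete, the following is a minimal sketch of a node score that rewards even utilization across CPU, memory, GPU, and network after a task is placed. It is not the BMRA algorithm of this thesis, whose details are not given here. The function names `balance_score` and `pick_node`, the resource list, and the formula (one minus the population standard deviation of the utilization fractions) are all assumptions chosen for illustration.

```python
from statistics import pstdev

# Hypothetical resource dimensions; the thesis may manage a different set.
RESOURCES = ("cpu", "memory", "gpu", "network")


def balance_score(capacity, used, request):
    """Return a score in [0, 1]; higher means more even utilization.

    Returns None when the request does not fit on the node.
    Illustrative formula only: 1 - population std-dev of the
    post-placement utilization fraction of each resource.
    """
    fractions = []
    for res in RESOURCES:
        cap = capacity.get(res, 0.0)
        need = request.get(res, 0.0)
        if cap <= 0:
            if need > 0:
                return None  # node lacks a resource the task needs
            continue  # dimension absent and not requested: ignore it
        frac = (used.get(res, 0.0) + need) / cap
        if frac > 1.0:
            return None  # placement would exceed capacity
        fractions.append(frac)
    if not fractions:
        return None
    return 1.0 - pstdev(fractions)


def pick_node(nodes, request):
    """Pick the feasible node with the highest balance score."""
    best_name, best_score = None, None
    for name, (capacity, used) in nodes.items():
        score = balance_score(capacity, used, request)
        if score is not None and (best_score is None or score > best_score):
            best_name, best_score = name, score
    return best_name


if __name__ == "__main__":
    cap = {"cpu": 8, "memory": 32, "gpu": 1, "network": 10}
    nodes = {
        "node-a": (cap, {"cpu": 6, "memory": 8, "gpu": 0, "network": 1}),
        "node-b": (cap, {"cpu": 2, "memory": 8, "gpu": 0, "network": 2}),
    }
    task = {"cpu": 1, "memory": 4, "gpu": 0.5, "network": 1}
    print(pick_node(nodes, task))
```

In this sketch, a node whose CPU is nearly exhausted while its other resources sit idle scores lower than a node whose dimensions fill at a similar rate, which is the kind of imbalance that scoring on CPU and memory alone cannot see for GPU or network.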