
Analysis and Optimization of Cloud Platform Performance Bottleneck Based on Machine Learning

Posted on: 2021-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Hua
GTID: 2518306503997799
Subject: Software engineering
Abstract/Summary:
Cloud computing provides customers with a pay-per-use service model. It gives enterprises and individuals convenient, on-demand, and efficient network access to shared pools of configurable computing resources, including networks, servers, storage, and application software. As small and medium-sized enterprises expand their business while trying to reduce hardware purchase costs, cloud computing platforms have gradually become the first choice for many companies that need such services. A cloud computing platform generally consists of a large number of computers that communicate through network protocols, also known as a computer cluster. However, clusters often suffer from low resource utilization, which imposes high costs on both cloud service providers and customers.

This thesis studies cloud platform performance at two levels. First, starting from performance bottleneck analysis, it proposes a machine learning method for classifying the background tasks in the public cloud computing cluster dataset released by Alibaba, and uses the classification to examine background task characteristics and performance bottlenecks. Second, starting from task scheduling, it studies a cloud computing resource scheduling mechanism and task scheduling strategy that support mixed task loads, and designs a mixed-load task scheduling strategy based on feedback control, thereby optimizing cloud platform performance from the perspective of task scheduling. The main work and contributions of this thesis are as follows:

1. This thesis reviews existing work on cloud computing cluster datasets and finds that most of it stops at statistical analysis. Applying machine learning clustering algorithms to the task and cluster data can improve the accuracy of task classification and of the resulting conclusions. The thesis therefore proposes a clustering-based method for classifying the background tasks of the Alibaba cloud computing cluster. Key task and machine data are selected from the public dataset released by Alibaba, and a four-step procedure based on a clustering algorithm produces the final classification. The validity of the classification is then verified using statistical analysis and charts.

2. Using the clustering-based task classification together with the design of the Alibaba cluster management architecture, this thesis analyzes the characteristics and performance bottlenecks of cluster background tasks and draws a series of conclusions of reference and research value. Some of these conclusions (such as a significant shortage of cluster memory capacity) can help cloud computing service providers improve performance in a targeted manner. Others support deeper study of task scheduling in cloud computing environments, so as to further optimize the performance and efficiency of the cloud platform.

3. For the task scheduling problem in cloud computing environments with mixed task loads, this thesis studies a resource scheduling mechanism that supports mixed loads and proposes two task priority scheduling strategies: delay-sensitive task priority scheduling and throughput-sensitive task priority scheduling. Together they form a high-performance, high-efficiency task scheduling solution for the cloud platform that ensures efficient mixed scheduling of different types of tasks.

4. This thesis builds a Hadoop cluster, implements a scheduler based on Apache Hadoop YARN, and runs experiments on the real cluster to verify the efficiency and reliability of the mixed-load task scheduling strategy. The experiments show that, compared with Fair Scheduler, priority scheduling of delay-sensitive tasks reduces latency by about 62.5%, while priority scheduling of throughput-sensitive tasks reduces latency by about 47.5%. In addition, simulation experiments are designed on the CloudSim platform. Compared with the default scheduling strategy, priority scheduling of delay-sensitive tasks reduces latency by about 74.4%, and priority scheduling of throughput-sensitive tasks reduces latency by about 71.4%. These results indicate that scheduling tasks with the strategy proposed in this thesis is efficient and reliable.

Cloud computing has become a scientific and technological foundation of current social development, and optimizing the performance of cloud computing services to achieve high service quality and high profitability has become an important issue. Analyzing the performance bottlenecks of the cloud platform and exploring methods for optimizing its performance therefore has considerable economic, practical, and promotional value.
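The idea behind delay-sensitive priority scheduling can be illustrated with a small sketch. This is not the thesis's YARN scheduler and does not reproduce its measurements; it is a toy single-worker, non-preemptive simulation in which the task names, arrival times, durations, and the `simulate` and `mean_latency` helpers are all hypothetical. It only shows why serving delay-sensitive tasks first can lower their latency relative to first-come-first-served order.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    kind: str        # "delay" (delay-sensitive) or "batch" (throughput-sensitive)
    arrival: float   # time the task is submitted
    duration: float  # time the task occupies the worker


def simulate(tasks, prioritized_kind=None):
    """Run tasks on one non-preemptive worker and return {name: latency}.

    With prioritized_kind=None the order is first-come-first-served.
    Otherwise, ready tasks of that kind are always served first.
    """
    pending = sorted(tasks, key=lambda t: t.arrival)
    ready = []       # heap of (rank, arrival, seq, task)
    latencies = {}
    clock = 0.0
    i = 0            # index of the next task that has not arrived yet
    seq = 0          # tie-breaker so Task objects are never compared
    while i < len(pending) or ready:
        # If the worker is idle, jump ahead to the next arrival.
        if not ready and pending[i].arrival > clock:
            clock = pending[i].arrival
        # Admit every task that has arrived by now.
        while i < len(pending) and pending[i].arrival <= clock:
            task = pending[i]
            rank = 0 if task.kind == prioritized_kind else 1
            heapq.heappush(ready, (rank, task.arrival, seq, task))
            seq += 1
            i += 1
        # Serve the best-ranked ready task to completion.
        _, _, _, task = heapq.heappop(ready)
        clock += task.duration
        latencies[task.name] = clock - task.arrival
    return latencies


def mean_latency(latencies, tasks, kind):
    """Average latency over the tasks of the given kind."""
    values = [latencies[t.name] for t in tasks if t.kind == kind]
    return sum(values) / len(values)


# Hypothetical workload: two long batch jobs and two short delay-sensitive jobs.
TASKS = [
    Task("b1", "batch", arrival=0.0, duration=10.0),
    Task("b2", "batch", arrival=0.0, duration=10.0),
    Task("d1", "delay", arrival=1.0, duration=1.0),
    Task("d2", "delay", arrival=2.0, duration=1.0),
]

fifo = simulate(TASKS)
prio = simulate(TASKS, prioritized_kind="delay")
fifo_delay = mean_latency(fifo, TASKS, "delay")
prio_delay = mean_latency(prio, TASKS, "delay")
reduction = (fifo_delay - prio_delay) / fifo_delay * 100
print(f"FIFO: {fifo_delay}, prioritized: {prio_delay}, reduction: {reduction}%")
```

In this toy workload the short delay-sensitive tasks no longer wait behind the second batch job, at the cost of a slightly later finish for that batch job. A real scheduler would also have to handle multiple workers, resource limits, and the feedback control described in the abstract, none of which is modeled here.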
Keywords/Search Tags: Cloud Platform Performance, Clustering Algorithm, Workload Classification, Task Scheduling