Research On Dynamic Resource Allocation Mechanism Of Spark Streaming

Posted on: 2018-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: B Liu
Full Text: PDF
GTID: 2428330596954756
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology and cloud computing platforms, more and more enterprises use Software as a Service (SaaS) platforms to build information services. In a SaaS data center operating under a multi-application, multi-tenant, multi-factor model, big data applications are complex and diverse and involve data and computation with different characteristics, so a single computing mode can hardly meet their varied needs. Spark's in-memory computing engine supports several classic models, including stream processing, batch processing, and large-scale graph processing, so Spark is selected as the hybrid data computing component of the SaaS platform. However, as Spark Streaming is applied more deeply in the SaaS platform, it faces challenges from cluster heterogeneity and from data streams that are real-time, dynamically changing, and massive, as well as from the differing requirements of tenants on a multi-application, multi-tenant platform. The existing resource allocation strategies of Spark Streaming have the following problems: (1) the resource scheduling strategy relies on a single reference factor, the number of remaining CPU cores, and assumes a homogeneous cluster, which makes it difficult to ensure low processing delay and efficient use of cluster resources; (2) the resource adjustment strategy does not take differing tenant needs into account and so cannot adequately meet tenants' individual needs, and its long adjustment cycle increases processing cost. This thesis therefore analyzes two aspects of the Spark Streaming resource allocation process, resource scheduling and resource adjustment, proposes improved strategies for both, and evaluates them experimentally. The experimental results show that the improved strategies reduce stream processing delay and improve cluster resource utilization.

The research work presented in this thesis is as follows:

(1) To address the shortcomings of the existing Spark Streaming resource scheduling strategy among applications, which considers only the number of remaining CPU cores and selects nodes randomly, the resource monitoring component Ganglia is added to collect dynamic information about the nodes in the cluster. From the CPU information gathered for each node over a certain period, the strategy calculates each node's actual CPU utilization, evaluates its resource level, and then schedules resources accordingly.

(2) To address the random assignment of an application's Executors to tasks in the Spark Streaming task resource scheduling strategy, the number of running tasks on each Executor and its historical execution time are added as reference factors for evaluating an Executor before it is assigned to tasks.

(3) After analyzing the structural model of Spark Streaming dynamic resource adjustment and the resource adjustment demands of tenants on the SaaS platform in multi-application, multi-tenant scenarios, a multi-tenant Spark Streaming dynamic resource adjustment strategy is proposed. The strategy can be defined according to tenant needs, so that the resource adjustment process can be controlled, the needs of different tenants met, and stream processing delay reduced.

(4) A Spark cluster is set up in the lab to test the proposed strategies. Experiments comparing the improved strategies with the default strategies analyze and verify the feasibility and superiority of the improved strategies in different application scenarios. Finally, the application of the improved strategies is verified.
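
The two scheduling ideas in contributions (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no formulas, so the class names, the mean-utilization ranking, the max-normalization, and the equal weights below are all assumptions chosen for illustration. In a real deployment the CPU samples would come from a monitor such as Ganglia and the Executor statistics from the Spark scheduler; here they are plain values.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Node:
    name: str
    free_cores: int
    cpu_samples: list  # recent CPU utilization samples in [0, 1] (hypothetical monitor output)


@dataclass
class Executor:
    executor_id: str
    running_tasks: int
    avg_task_seconds: float  # mean historical task execution time


def rank_nodes(nodes, cores_needed):
    """Return nodes with enough free cores, least loaded first.

    Load is the mean CPU utilization over the sampling window (assumed metric).
    """
    eligible = [n for n in nodes if n.free_cores >= cores_needed and n.cpu_samples]
    return sorted(eligible, key=lambda n: mean(n.cpu_samples))


def pick_executor(executors, w_tasks=0.5, w_time=0.5):
    """Pick the Executor with the lowest weighted score (assumed scoring rule).

    Both factors are normalized by their maximum so the weights are comparable.
    Raises ValueError if `executors` is empty.
    """
    max_tasks = max(e.running_tasks for e in executors) or 1
    max_time = max(e.avg_task_seconds for e in executors) or 1.0

    def score(e):
        return w_tasks * e.running_tasks / max_tasks + w_time * e.avg_task_seconds / max_time

    return min(executors, key=score)
```

Ranking by measured utilization instead of remaining core count is what lets a heterogeneous cluster avoid placing work on a node that has free cores but is already busy; the weights in `pick_executor` are a tunable trade-off between current load and past speed.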
Keywords/Search Tags:SaaS, Spark Streaming, Multi-tenant, Dynamic resource scheduling strategy, Dynamic resource adjustment strategy