
A Cost-efficient Scheduling Algorithm For Streaming Processing Applications On Storm

Posted on: 2022-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: H X Dai
Full Text: PDF
GTID: 2518306575967059
Subject: Computer technology
Abstract/Summary:
Stream processing is an emerging in-memory computing paradigm that can efficiently process dynamic data streams, and Storm is a mainstream framework for real-time big data stream processing, so research on the cost-effectiveness of Storm is of considerable significance. This thesis analyzes the system architecture and operating principles of Storm and reviews related work on Storm Task scheduling algorithms and cost-effectiveness. Currently, Storm's default scheduler does not consider cluster cost, and at run time the differences among Tasks and the heterogeneity of the cluster incur additional cost. To address these problems, this thesis proposes a cost-efficient model of stream processing and the CE-Storm scheduling algorithm. Experiments show that the proposed method effectively improves cost-effectiveness. The main work of this thesis is as follows:

1. Build the heterogeneous platform that supports the cost-efficient model of stream processing and the CE-Storm scheduling algorithm. This includes improving the architecture of Storm and using monitoring scripts to collect the required information, such as two-dimensional resource usage, running status and communication data. An energy consumption processing module and a communication data module then process the collected information.

2. To reduce the resource overhead, communication overhead and energy consumption overhead that the default Task scheduler incurs during execution, an integrated cost-efficient model for stream processing is constructed. The three types of cost generated during Storm Task scheduling (resource usage cost, energy consumption cost and communication cost) are modeled abstractly so as to balance them against one another and reduce the total cost of the cluster. A comparison across different scheduling algorithms shows the model to be effective.

3. Propose a Storm Task scheduling algorithm (CE-Storm) based on the cost-efficient model of stream processing. Because Storm's native scheduler places Tasks by polling (round-robin), it not only wastes resources but also increases the communication overhead between nodes. CE-Storm instead allocates resources and Tasks by cost minimization, with the aim of minimizing the total cost of the cluster. Experiments with the HiBench benchmark show that, compared with the traditional scheduling strategy, a random algorithm and the BFD algorithm, CE-Storm reduces cluster cost by 19.79%, 17.12% and 13.47% respectively. CE-Storm can therefore effectively improve the cost-efficiency of a Storm cluster while still meeting performance constraints.
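
The abstract does not give the thesis's cost formulas, so the following Python sketch only illustrates the general idea of combining resource, energy and communication cost into one value and placing each Task on the node where that value is lowest, next to a polling (round-robin) baseline. Everything in it is an assumption made for illustration, not the CE-Storm implementation: the `Node` and `Task` fields, the linear cost terms, the equal weights, and the helper names `placement_cost`, `schedule_min_cost` and `schedule_round_robin`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_capacity: float   # CPU units the node can host (hypothetical unit)
    price_per_cpu: float  # assumed resource price per CPU unit
    watts_per_cpu: float  # assumed marginal energy cost per CPU unit
    used: float = 0.0


@dataclass
class Task:
    name: str
    cpu: float
    # upstream task name -> data rate received from it (hypothetical unit)
    upstream: dict = field(default_factory=dict)


def placement_cost(task, node, assignment, w_res=1.0, w_energy=1.0, w_comm=1.0):
    """Weighted sum of the three cost types for putting `task` on `node`."""
    resource = task.cpu * node.price_per_cpu
    energy = task.cpu * node.watts_per_cpu
    # Only traffic that crosses node boundaries is charged.
    comm = sum(rate for peer, rate in task.upstream.items()
               if peer in assignment and assignment[peer] != node.name)
    return w_res * resource + w_energy * energy + w_comm * comm


def schedule_min_cost(tasks, nodes):
    """Greedy placement: each task goes to the feasible node with the lowest cost."""
    assignment, total = {}, 0.0
    for task in tasks:
        feasible = [n for n in nodes if n.used + task.cpu <= n.cpu_capacity]
        if not feasible:
            raise ValueError(f"no node has room for task {task.name}")
        best = min(feasible, key=lambda n: placement_cost(task, n, assignment))
        total += placement_cost(task, best, assignment)
        best.used += task.cpu
        assignment[task.name] = best.name
    return assignment, total


def schedule_round_robin(tasks, nodes):
    """Baseline: cycle through nodes in order, ignoring cost (and capacity)."""
    assignment, total = {}, 0.0
    for i, task in enumerate(tasks):
        node = nodes[i % len(nodes)]
        total += placement_cost(task, node, assignment)
        node.used += task.cpu
        assignment[task.name] = node.name
    return assignment, total


def example_nodes():
    # Two heterogeneous nodes with made-up prices and power figures.
    return [Node("n1", 4.0, 1.0, 1.0), Node("n2", 4.0, 2.0, 1.5)]


def example_tasks():
    # A three-stage pipeline: spout -> bolt1 -> bolt2.
    return [Task("spout", 1.0),
            Task("bolt1", 1.0, {"spout": 5.0}),
            Task("bolt2", 1.0, {"bolt1": 3.0})]
```

Calling `schedule_min_cost(example_tasks(), example_nodes())` keeps the whole pipeline on the cheaper node and so avoids inter-node traffic, while `schedule_round_robin` alternates between the two nodes and pays for every hop. A greedy pass like this is only one simple way to approximate cost minimization; the source does not say which optimization method CE-Storm uses.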
Keywords/Search Tags: Storm, Storm cluster, scheduling algorithm, cost-efficient