
Research and Implementation of a Scheduling Optimization Method for Storm Cloud Computing

Posted on: 2018-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: W B Ren
Full Text: PDF
GTID: 2348330518475641
Subject: Computer technology
Abstract/Summary:
With the development of computer and network technology in recent years, the demand for real-time data stream processing, typified by big data processing for the IoT, has increased. Hadoop, which is built on batch processing, is inadequate for this workload, and real-time data stream processing has become a hot research topic. Storm, a typical framework for streaming data and parallel computing, has important applications in real-time analysis, online machine learning, continuous computation and distributed remote procedure calls.

As Storm is applied more widely, the defects of its scheduling strategy for resources and tasks have become more and more prominent: 1) it does not take communication cost into consideration; 2) the built-in even scheduler can still leave the load unbalanced; 3) parallelism parameters must be configured manually; 4) it cannot re-schedule tasks as the cluster load changes. These defects degrade the performance of a Storm cluster.

To solve these problems, this thesis establishes a scheduling model for Storm in which a Topology is abstracted as a weighted graph, with the aims of reducing communication cost and ensuring load balance. We then propose a scheduling strategy based on heuristic graph partitioning and implement a dynamic scheduler: 1) the performance log is used as the input of the scheduler to realize dynamic scheduling, dynamic optimization of parallelism parameters, and re-scheduling optimization; 2) static task allocation is realized by analyzing the Topology structure. As a result, the communication cost between cluster nodes is reduced, the load across nodes is balanced, the data processing delay is reduced, and the cluster throughput is improved.

The specific work includes the following four aspects:

(1) Built a Storm optimization model and a performance detection model. We describe resources and Tasks mathematically and build the Storm scheduling model. By analyzing the Storm framework, creating a performance monitoring thread for each Worker, collecting run-time performance data and saving it to a performance log, we set up the corresponding performance detection model.

(2) Proposed a heuristic graph partitioning algorithm and an iterative balance adjustment strategy for the Storm model. The Topology structure and the performance log are transformed into a directed weighted graph, and the Storm scheduling problem is reduced to a graph partitioning problem. The simulation results show that the K-Part partitioning algorithm and the iterative optimization method are more effective.

(3) Implemented a scheduler based on graph partitioning. Dynamic scheduling and static allocation optimization are both achieved with the graph partitioning algorithm. In dynamic scheduling the parallelism parameters are optimized automatically, and the re-scheduling optimization problem is solved.

Finally, this thesis establishes a simulation environment for a Storm cluster and uses it to verify the performance of the scheduling strategy with the optimized scheduler. We conclude by summarizing the thesis, analyzing the remaining problems, and outlining directions for further research.
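The reduction of scheduling to graph partitioning can be illustrated with a small sketch. This is not the thesis's K-Part algorithm, whose details the abstract does not give; it is a generic greedy heuristic under assumed inputs. The names `greedy_partition`, `cut_weight`, `task_load`, `edges` and `slack` are hypothetical: `task_load` stands for per-task load (such as might be read from a performance log), `edges` for the traffic between pairs of tasks, and `slack` for how far a node may exceed the average load.

```python
from collections import defaultdict


def greedy_partition(task_load, edges, k, slack=1.2):
    """Assign tasks to k nodes (illustrative heuristic, not the thesis's K-Part).

    Each task goes to the node that already hosts the neighbours it talks to
    most, as long as that node stays under the load cap.
    """
    # Load cap per node: average load times an assumed tolerance factor.
    cap = slack * sum(task_load.values()) / k

    # Build an undirected adjacency map from the weighted edge list.
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0) + w
        adj[v][u] = adj[v].get(u, 0) + w

    node_load = [0.0] * k
    assign = {}
    # Place the heaviest tasks first so they are not squeezed out later.
    for t in sorted(task_load, key=lambda t: -task_load[t]):
        # gain[n] = traffic kept local if t is placed on node n.
        gain = [0.0] * k
        for nb, w in adj[t].items():
            if nb in assign:
                gain[assign[nb]] += w
        fits = [n for n in range(k) if node_load[n] + task_load[t] <= cap]
        candidates = fits or list(range(k))  # fall back if no node fits
        # Prefer the highest local traffic, then the least loaded node.
        best = max(candidates, key=lambda n: (gain[n], -node_load[n]))
        assign[t] = best
        node_load[best] += task_load[t]
    return assign, node_load


def cut_weight(assign, edges):
    """Total traffic on edges whose endpoints sit on different nodes."""
    return sum(w for (u, v), w in edges.items() if assign[u] != assign[v])


# Toy topology: two spouts, two bolts, heavy s1->a1 and s2->a2 streams.
task_load = {"s1": 1, "s2": 1, "a1": 1, "a2": 1}
edges = {("s1", "a1"): 10, ("s2", "a2"): 10, ("s1", "a2"): 1, ("s2", "a1"): 1}
assign, node_load = greedy_partition(task_load, edges, k=2)
print(assign, node_load, cut_weight(assign, edges))
```

In this toy case the heavy streams stay on one node each and only the light cross edges are cut. A real scheduler would additionally need the iterative balance adjustment and re-scheduling that the abstract describes, which this sketch omits.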
Keywords/Search Tags: Storm, stream, dynamic scheduling, graph partitioning, re-scheduling