
Runtime-aware Scheduling In Stream Processing Systems

Posted on: 2016-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2348330479453436
Subject: Computer application technology
Abstract/Summary:
In recent years, a growing share of data has been processed in real time. The value of data in stream applications can be fully exploited only if the stream processing system keeps latency very low. However, because stream input rates and resource availability fluctuate while a stream application runs, stream operator tasks progress at different rates and should be redistributed dynamically to speed up processing. Load balancing and scheduling techniques from traditional parallel and distributed systems do not work in stream processing systems, and the current load balancing and scheduling techniques in stream processing systems cannot dynamically redistribute tasks to cope with fluctuations in input rates or resource availability. A new scheduling technique is therefore needed.

To solve this problem, we present and implement a runtime-aware scheduling mechanism that aims to decrease processing latency by redistributing tasks among nodes while taking the features of stream processing into account. First, a performance cost ratio (PCR) method is proposed to detect how fluctuations in the running environment affect each node's processing efficiency and to provide information for scheduling decisions at runtime. Second, a PCR-based scheduling algorithm assigns the amount of computation according to each node's current processing capacity and considers how to reduce task migration. Third, the exponential smoothing method is improved for stream processing systems and used to predict scheduling feasibility. Finally, the schedule is deployed, with the result that each task runs either on its initial node or on a faster node. We have implemented the scheduler as an extension to Storm, reusing some existing modules to reduce overhead.

We evaluate the scheduler and compare it with the initial scheduling under the same experimental conditions. The results show that the scheduler decreases processing latency by 29.6% and the latency difference between nodes by 47.4%. Moreover, processing the same amount of data uses fewer computational resources.
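
For readers unfamiliar with exponential smoothing, the following is a minimal sketch of the standard, unmodified method only. It is not the improved variant the thesis describes, whose details the abstract does not give. The function name, the smoothing factor alpha, and the sample latency values are illustrative assumptions.

```python
def exponential_smoothing(observations, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.

    The first smoothed value is initialised to the first observation.
    This is the textbook method, not the thesis's improved version.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in observations:
        if not smoothed:
            # No history yet: start from the first observation.
            smoothed.append(float(x))
        else:
            # Blend the new observation with the previous smoothed value.
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical per-interval latencies (ms) observed on one node.
latencies_ms = [100.0, 120.0, 80.0, 110.0]
history = exponential_smoothing(latencies_ms, alpha=0.5)
# The last smoothed value serves as the forecast for the next interval.
next_latency_estimate = history[-1]
```

A larger alpha makes the estimate follow recent fluctuations more closely, while a smaller alpha damps short-lived spikes; a scheduler could use such an estimate to judge whether a planned redistribution is still likely to pay off.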
Keywords/Search Tags: Stream processing, Schedule, Load balancing, Real-time, Runtime