Research on a Container-Based Elastic Stream Processing System

Posted on: 2018-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: X J Wang
Full Text: PDF
GTID: 2428330569475170
Subject: Computer system architecture
Abstract/Summary:
As the demand for real-time data processing grows, stream processing systems have attracted wide attention in both academia and industry. A batched stream processing system (BSPS) converts continuous stream processing into a sequence of small batch jobs, and it has become a research hotspot in recent years. When deployed in production, however, a BSPS faces two problems. First, load fluctuation is common, and users often provision the system for peak traffic. Peak traffic is hard to estimate in advance, so it is hard to choose a configuration that guarantees performance, and resources are wasted whenever the system is underloaded. Second, load imbalance produces straggler nodes, which severely impair performance. The container-based elastic stream processing system proposed in this thesis addresses both problems. It adopts a containerized runtime architecture to exploit the fast startup and flexible resource management of containers, achieving better resource scalability than virtual machines or physical machines. The system implements a proactive elastic resource scheduling mechanism with two parts: (1) adaptive cluster scaling, which scales the cluster according to runtime information, guaranteeing performance when the system is overloaded and reducing resource waste when it is underloaded; and (2) requirement-aware resource scheduling, which reallocates CPU resources within the cluster according to the load distribution when load imbalance exists, avoiding the extra cost of load balancing based on data repartitioning. Experimental results show that, compared with the original Spark Streaming, the proposed system scales the cluster effectively in response to load fluctuation and reduces resource usage by up to 30%. It also reduces batch processing time when load imbalance exists.
Keywords/Search Tags: stream processing, container, elastic, resource scheduling
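
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual policy, so everything here is an assumption made for illustration: the threshold rule (compare average batch processing time with the batch interval), the threshold values, the one-container step size, and the proportional CPU split are all hypothetical, as are the names `ScalingPolicy`, `decide_workers`, and `rebalance_cpu`. The sketch is also purely reactive, whereas the abstract describes the thesis mechanism as proactive.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical thresholds; the abstract does not state the real values."""
    batch_interval_s: float = 1.0
    upper: float = 0.9   # scale out above 90% of the batch interval
    lower: float = 0.5   # scale in below 50% of the batch interval
    min_workers: int = 1
    max_workers: int = 16


def decide_workers(policy, current_workers, recent_processing_times_s):
    """Return a target worker-container count from recent batch timings."""
    if not recent_processing_times_s:
        return current_workers
    avg = sum(recent_processing_times_s) / len(recent_processing_times_s)
    utilisation = avg / policy.batch_interval_s
    if utilisation > policy.upper:
        target = current_workers + 1   # overloaded: add one container
    elif utilisation < policy.lower:
        target = current_workers - 1   # underloaded: release one container
    else:
        target = current_workers
    # Clamp to the configured cluster bounds.
    return max(policy.min_workers, min(policy.max_workers, target))


def rebalance_cpu(total_cpu, load_per_node):
    """Split a fixed CPU budget across nodes in proportion to their load,
    instead of repartitioning the data (assumed proportional rule)."""
    if not load_per_node:
        return {}
    total_load = sum(load_per_node.values())
    if total_load == 0:
        even = total_cpu / len(load_per_node)
        return {node: even for node in load_per_node}
    return {node: total_cpu * load / total_load
            for node, load in load_per_node.items()}
```

For example, `decide_workers(ScalingPolicy(), 4, [0.95, 1.1])` asks for a fifth container because batches are taking longer than the interval allows, and `rebalance_cpu(8.0, {"a": 3, "b": 1})` gives the heavier node three quarters of the CPU budget. In a containerized deployment the resulting targets would be applied through the container runtime's own resource controls, which this sketch leaves out.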