
Research on Memory Pressure in Shared Executors for Distributed Data Processing Systems

Posted on: 2018-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2428330569975164
Subject: Computer system architecture
Abstract/Summary:
Many popular distributed data processing systems are sped up by in-memory computing models; however, the limited memory space leads to heavy memory pressure. Although a data processing system often works as a batch processing system, many enterprises deploy it with shared executors to support sharing and cooperation. In a shared executor, multiple submitted tasks are launched at the same time and executed in the same context on the same resources, in contrast to batch processing mode, where tasks are processed one by one. As a result, memory pressure affects all submitted tasks, including tasks that incur only light memory pressure when run alone. These light tasks can become stragglers in their jobs. Stragglers delay job completion and reduce the efficiency of memory and CPUs.

In this work, we propose a scheduler that mitigates memory pressure in shared executors for distributed data processing systems. We find that memory pressure arises because running tasks produce massive numbers of long-living data objects in the limited memory space. Our studies further reveal that these long-living data objects are generated by the API functions invoked by the in-memory processing frameworks. Based on these findings, we build four coarse-grained memory usage models for tasks and propose a method to classify the API functions by memory usage rate. The memory usage rate can also be used to classify the memory pressure of tasks. We then design a scheduler called MURS to mitigate memory pressure. MURS suspends tasks with heavy memory pressure so that the other tasks can keep executing and do not become stragglers, while also preventing the suspended tasks from starving. In this way MURS reduces the execution time of all tasks and preserves the timeliness of the system.

We implement MURS in Spark and conduct experiments to evaluate its performance. The results show that, compared with Spark, MURS can 1) decrease the execution time of the submitted jobs by up to 65.8%, 2) mitigate the memory pressure in the shared executors by decreasing the garbage collection time, and 3) avoid stragglers in the jobs with light memory pressure.
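The suspend-and-resume idea described in the abstract can be illustrated with a small sketch. This is not the MURS implementation and does not use any Spark API. It assumes a hypothetical definition of memory usage rate (bytes of long-living objects per processed record), hypothetical high and low heap watermarks, and a hypothetical `heavy_rate` threshold, none of which are given in the abstract. When heap usage crosses the high watermark, the heaviest tasks are suspended first while at least one task keeps running; when usage drops below the low watermark, suspended tasks are resumed in FIFO order so that none of them starves.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    long_living_bytes: int  # bytes of long-living objects held (hypothetical metric)
    records_processed: int  # input records consumed so far

    def memory_usage_rate(self) -> float:
        # Assumed definition: long-living bytes produced per processed record.
        return self.long_living_bytes / max(self.records_processed, 1)


class PressureAwareScheduler:
    """Illustrative scheduler; every threshold here is an assumption."""

    def __init__(self, heap_capacity, high_watermark=0.8, low_watermark=0.5,
                 heavy_rate=1024.0):
        self.high = high_watermark * heap_capacity  # start suspending at this usage
        self.low = low_watermark * heap_capacity    # start resuming at this usage
        self.heavy_rate = heavy_rate                # rate at which a task counts as heavy
        self.running = {}                           # task_id -> Task
        self.suspended = deque()                    # FIFO, oldest suspension first

    def add(self, task):
        self.running[task.task_id] = task

    def on_heap_sample(self, heap_used):
        """React to one heap-usage sample; return (suspended_ids, resumed_ids)."""
        suspended_now, resumed_now = [], []
        if heap_used >= self.high:
            # Consider heavy tasks only, heaviest first.
            heavy = sorted(
                (t for t in self.running.values()
                 if t.memory_usage_rate() >= self.heavy_rate),
                key=lambda t: t.memory_usage_rate(),
                reverse=True,
            )
            for task in heavy:
                if len(self.running) <= 1:
                    break  # always leave one task running so work continues
                del self.running[task.task_id]
                self.suspended.append(task)
                suspended_now.append(task.task_id)
        elif heap_used <= self.low and self.suspended:
            # Resume the longest-suspended task first so no task starves.
            task = self.suspended.popleft()
            self.running[task.task_id] = task
            resumed_now.append(task.task_id)
        return suspended_now, resumed_now
```

Resuming one task per sample is a deliberately conservative choice in this sketch: it gives the heap a chance to be measured again before more heavy tasks are readmitted.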
Keywords/Search Tags:In-memory computing, Straggler, Memory pressure, Task scheduler, Shared executors