
Towards Adaptive Cache Management For Dataflow Computation With Memory Resource Constraints

Posted on: 2021-03-24  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Lv  Full Text: PDF
GTID: 2428330623465012  Subject: Computer technology
Abstract/Summary:
The growth of applications in data-parallel systems, together with the increasing demands of task processing and data analysis, places enormous memory pressure on the processing of large data sets. Insufficient memory and inefficient caching strategies can seriously degrade system performance. For dataflow tasks that require large amounts of in-memory computation, efficient cache management is a primary determinant of both performance and memory overhead, and it remains an open problem in the field. In recent years, designing appropriate cache management strategies to balance workload, ease transmission bottlenecks, and reduce memory resource consumption has become one of the key concerns of in-memory computing. The default caching algorithms in current data processing systems do not meet the application characteristics and real-time requirements of such environments: they easily cause low hit rates, unnecessary I/O overhead, and wasted resources. The main reason is that these algorithms do not exploit the data dependency semantics available in data-parallel systems, but rely only on traditional locality information such as recency and frequency.

Building on existing research on cache management for parallel data processing in in-memory computing systems, this thesis designs a new caching algorithm, Non-critical Path Least Reference Count (NLC), that achieves a compromise between performance and overhead. NLC differs from existing algorithms in two ways. First, NLC fully exploits the data semantic information provided by the data-parallel system, applying the global information extracted from the data processing logic to cache replacement rather than only to resource scheduling, as most existing work does. Second, NLC borrows the narrow-dependency pipelining idea from Spark and further exploits the dynamic changes of critical-path information provided by the data dependency semantics. Execution-sequence tracking analysis and experiments in resource-constrained environments show that the system meets the requirements of parallel systems and that NLC effectively improves the efficiency of parallel execution: under optimal conditions, task response time is reduced by about 19%.
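The eviction idea described above can be illustrated with a minimal sketch. The class and field names below are hypothetical (the thesis does not give an implementation): each cached block carries a remaining reference count derived from the dependency DAG and a flag marking whether it lies on the critical path, and eviction prefers non-critical-path blocks with the fewest remaining references.

```python
from dataclasses import dataclass


@dataclass
class Block:
    block_id: str
    ref_count: int          # remaining downstream references in the DAG
    on_critical_path: bool  # derived from dependency/stage analysis


class NLCCache:
    """Hypothetical NLC-style cache: evict non-critical-path blocks
    with the least remaining reference count first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks: dict[str, Block] = {}

    def put(self, block: Block) -> None:
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[block.block_id] = block

    def _evict(self) -> None:
        # Sort key: critical-path blocks (True) rank after non-critical
        # ones (False); ties broken by the smaller reference count.
        victim = min(
            self.blocks.values(),
            key=lambda b: (b.on_critical_path, b.ref_count),
        )
        del self.blocks[victim.block_id]

    def complete_reference(self, block_id: str) -> None:
        # Called when a downstream task finishes consuming the block.
        if block_id in self.blocks:
            self.blocks[block_id].ref_count -= 1
```

For example, with capacity 2, inserting a critical-path block and two non-critical blocks evicts the non-critical block with the lowest reference count, even if a critical-path block has fewer references overall.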
Keywords/Search Tags:In-Memory Computing, Cache Management, Resource Scheduling, LRC, Critical Path