Font Size: a A A

Theory And Method For Data Allocation And Task Scheduling On Multicore Systems

Posted on:2017-04-20Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y WangFull Text:PDF
GTID:1368330488977076Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With the continuous expansion of scientific research and the advent of cloud computing and big-data technology,the characteristics of high-performance computing has been turned from computing-intensive to data-intensive.High-performance computing faces enormous challenges,and researchers need to collect and process mass data for studying and solving complex scientific problems.Together with the new trend of green computing,how to reduce energy consumption,budget,and cost while an application is executed on multi-processor systems has receive extensive attention and become a research hot spot.In high-performance computing,data assignment and task scheduling is critical to improve performance and reduce energy consumption.Due to the compact relationship between data allocation and task scheduling,the data allocation will affect the position of supported task nodes,while task scheduling will affect the allocation of requirement data.An unreasonable approach of data allocation and task scheduling will cause long memory access time,which will not only delay data transmission in network and affect the whole system performance,but also reduce system throughput and operating efficiency.Furthermore,an unreasonable approach of data allocation and task scheduling will reduce system reliability and cause high energy consumption.Therefore,exploiting an efficient data allocation and task scheduling algorithm has great value and practical significance.This paper focuses on the research of data allocation and task scheduling on multiprocessor systems for real-time applications to minimize the total energy consumption,reduce scheduling length,meet the application's requirements,and hide memory latency,through assigning data to proper processor to and allocating task to each CPU.In this paper,we first study the nested loop scheduling problem of multi-dimensional applications in multi-core.Most scientific and digital signal processing(DSP)applications are recursive or iterative.The execution of these applications on a chip multiprocessor(CMP)encounters two challenges.First,as most of the digital signal processing applications are both computation intensive and data intensive,an inefficient scheduling scheme may generate huge amount of write operation,cost a lot of time,and consume significant amount of energy.Second,because CPU speed has been increased dramatically compared with memory speed,the slowness of memory hinders the overall system performance.In this paper,we develop a Two-Level Partition(TLP)algorithm that can minimize write operation while achieving full parallelism for multi-dimensional DSP applications running on CMPs which employ scratchpad memory(SPM)as on-chip memory(e.g.,the IBM Cell processor).Experiments on DSP benchmarks demonstrate the effective-ness and efficiency of the TLP algorithm,namely,the TLP algorithm can completely hide memory latencies to achieve full parallelism and generate the least amount of write operation to main memory compared with previous approaches.Experimental results show that our proposed algorithm is superior to all known methods,including the list scheduling,rotation scheduling,Partition Scheduling with Prefetching(PSP),and Iterational Retiming with Partitioning(IRP)algorithms.Furthermore,the TLP scheduling algorithm can reduce write operation to main memory by 45.35% and reduce the schedule length by 23.7% on average compared with the IRP scheduling algorithm,the best known algorithm.A modern high-performance computing system normally consists of heterogeneous computing and communication resources,including heterogeneous processors,heterogeneous memories,and heterogeneous communication interconnections.In this paper,we address the problem of energy-aware heterogeneous data allocation and task scheduling on heterogeneous multiprocessor systems for real-time applications.In a heterogeneous distributed shared-memory multiprocessor system,an important problem is how to assign processors to real-time application tasks,allocate data to local memories,and generate an efficient schedule in such a way that a time constraint can be met and the total system energy consumption can be minimized.We propose an optimal approach,i.e.,an integer linear programming method,to solve this problem.As the problem has been conclusively shown to be very computationally complicated,we also present two heuristic algorithms,i.e.,task assignment considering data allocation(TAC-DA)and task ratio greedy scheduling(TRGS),to generate near-optimal solutions for real-time applications in polynomial time.Based on our extensive simulation study,we observe that our algorithms exhibit excellent performance and are superior to greedy algorithm.In this paper,we also address the energy optimization problem of data allocatied and task scheduling incurred by applications execution on CMP with hybrid scratch pad memories(SPMs),where each SPM is composed of a SRAM and a NVM.While an application with data dependencies is executed on the targeted architecture,the following problems must be resolved,i.e.,improving energy consumption,improving the endurance of NVM(reducing the number of write operations on NVM),and shortening scheduling time.In this paper,to solve these problem,we first propose a data allocation algorithm for single core systems to reduce energy consumption while controlling the number of write operations on NVM,then we propose two novel algorithms,i.e.,EADA and BDAEW,for CMP with hybrid SPMs to solve the energy optimization problems of data allocation and task scheduling.According to the simulation and experimental results,our proposed algorithms exhibit excellent performance and are superior to greedy algorithm.Flash memory and PCM are two different nonvolatile memories.This paper proposes a write-aware data allocation approach by employing PCM+flash as hybrid main memory.According to data size and the number of read and write operations,the approach finds a proper memory for each data to reduce energy consumption and unnecessary copy-andwrite operations.For the sake of effective management of PCM and flash and improvement of their lifetime,we propose BBWL algorithm to manage the update and write operations on PCM,and propose RGC algorithm to manage the invalid pages of flash.The simulation experimental results show that the proposed approach can substantially reduce energy consumption and prolong lifetime of the proposed hybrid main memory.
Keywords/Search Tags:Data allocation, Energy consumption, hybrid memory, ILP, multi-core, NVM, Task scheduling, Time constraint
PDF Full Text Request
Related items