
Multi-granularity Partition And Scheduling Method Research For Stream Programs On Multi-CPU And Multi-GPU Heterogeneous Architectures

Posted on: 2016-01-22
Degree: Master
Type: Thesis
Country: China
Candidate: W B Chen
GTID: 2348330479953421
Subject: Computer application technology
Abstract/Summary:
Dataflow programming languages simplify domain-specific programming and offer an attractive way to express parallelism. However, the complexity of the underlying computation and communication on multi-CPU and multi-GPU heterogeneous architectures strongly influences the performance of dataflow applications. How to partition and schedule such programs effectively has been a hot topic in parallel computing in recent years.

To exploit the large amount of data, task, and pipeline parallelism available on multi-CPU and multi-GPU architectures, this thesis proposes an efficient dataflow compilation framework. The framework takes a synchronous dataflow graph as its input and applies several partition methods to distribute the tasks across the CPUs and GPUs. The task classification method assigns each task to a GPU or a CPU according to the task's parallelism and its communication. The GPU task horizontal splitting method divides the tasks assigned to the GPUs into blocks, each of which is executed by one GPU, which avoids GPU-to-GPU communication. The CPU dispersed-task balancing partition method selects an appropriate set of CPU cores and balances the CPU tasks across them, which achieves load balancing and raises the utilization of the CPU cores. The optimizations comprise storage optimization and object code optimization. Storage optimization uses multiple storage structures and multiple access types to improve access efficiency. An object code optimization method, Object Mode Template, is proposed to reduce redundancy in the object code.

We use a multi-CPU and multi-GPU heterogeneous architecture as the experimental platform and common algorithms from media processing applications as benchmarks, and we evaluate the performance of the partition methods and the optimization methods. Our experiments show that the partition methods improve the performance of the dataflow programs, and that the optimizations raise access efficiency and reduce the redundancy of the object code.
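
The abstract names three partition steps without giving their details, so the following Python sketch is only one plausible reading of them, not the thesis's implementation. The `Actor` fields, the work-to-communication threshold, the reading of horizontal splitting as giving each GPU a disjoint block of iterations, and the greedy longest-task-first balancing are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Actor:
    """A node of the synchronous dataflow graph (fields are hypothetical)."""
    name: str
    work: float       # estimated work per steady-state iteration
    stateless: bool   # stateless actors expose data parallelism
    comm: float       # data volume exchanged with neighbours per iteration


def classify(actors, ratio_threshold=4.0):
    """Assumed rule: stateless actors with a high work/communication ratio go to the GPU."""
    gpu, cpu = [], []
    for a in actors:
        ratio = a.work / a.comm if a.comm > 0 else float("inf")
        if a.stateless and ratio >= ratio_threshold:
            gpu.append(a)
        else:
            cpu.append(a)
    return gpu, cpu


def split_horizontally(iterations, num_gpus):
    """Give each GPU a contiguous, disjoint block of iterations.

    Every GPU runs the whole GPU task group on its own block,
    so no data has to move between GPUs.
    """
    base, extra = divmod(iterations, num_gpus)
    blocks, start = [], 0
    for g in range(num_gpus):
        size = base + (1 if g < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def balance_on_cpu(actors, num_cores):
    """Greedy balancing: place the heaviest remaining actor on the least-loaded core."""
    loads = [0.0] * num_cores
    assignment = [[] for _ in range(num_cores)]
    for a in sorted(actors, key=lambda a: a.work, reverse=True):
        core = loads.index(min(loads))
        assignment[core].append(a.name)
        loads[core] += a.work
    return assignment, loads


# Made-up example graph.
actors = [
    Actor("fft", work=100, stateless=True, comm=10),
    Actor("src", work=5, stateless=False, comm=5),
    Actor("sink", work=3, stateless=False, comm=3),
    Actor("filter", work=8, stateless=True, comm=4),
]
gpu_actors, cpu_actors = classify(actors)
gpu_blocks = split_horizontally(iterations=10, num_gpus=3)
cpu_assignment, cpu_loads = balance_on_cpu(cpu_actors, num_cores=2)
```

In this toy graph only `fft` is classified as a GPU task, its ten iterations are split into blocks of four, three, and three, and the three CPU actors end up on two equally loaded cores. A real compiler would derive the work and communication estimates from the graph's steady-state schedule rather than from hand-written numbers.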
Keywords/Search Tags:Heterogeneous architectures, Dataflow programs, Partition method, Storage optimization