
Data Optimization In Parallel Compilation For Heterogeneous Multi-core Processor

Posted on: 2015-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Li
GTID: 2308330482979073
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, heterogeneous multi-core processors have developed rapidly and are increasingly used in high-performance computing. A heterogeneous multi-core processor integrates processor cores with different structures on a single chip, so different types of tasks can be assigned to different cores and executed more efficiently. This also brings challenges, notably in programming and in performance optimization. Solving these problems is the key to exploiting the performance advantage of heterogeneous multi-core processors, and parallel compilation technology is an effective way to address both.

This thesis adopts the OpenACC programming model for heterogeneous multi-core processors and develops Auto-ACC, a source-to-source parallel compilation system based on Open64 that automatically translates serial programs into parallel OpenACC programs. Heterogeneous multi-core processors usually have a complex multi-level storage system, so optimizing data storage and transmission is a key technology in parallel compilation for these processors. This thesis studies data optimization in detail. The main work and contributions are as follows:

(1) The data optimization architecture for heterogeneous multi-core processors is improved. Building on the existing array blocking method in the data optimization architecture, a loop tiling optimization method is designed and implemented. It addresses the high complexity of the original array blocking method when arrays are accessed in large and complex patterns, and it effectively improves program data locality. On this basis, a new cross-block data transmission mode is added, which makes data transmission more accurate and efficient.

(2) A loop tiling method for heterogeneous multi-core processors is proposed. Loop tiling is a commonly used method for improving data locality. The method proposed in this thesis is implemented by adding pragmas to the program. Compared with previous loop tiling methods based on program transformation, it needs no complex data dependence analysis, is simpler and more efficient, and does not require the nested loops to be permuted, so it applies to a wider range of programs. A loop tiling clause generation algorithm for heterogeneous multi-core processors is proposed and implemented in Auto-ACC.

(3) Cross-block data transmission for heterogeneous multi-core processors is implemented. By extending the OpenACC data copy clause, cross-block data transmission guided by OpenACC pragmas is implemented. Based on the polyhedral representation of the program, automatic generation of the extended data copy clause is implemented in Auto-ACC. Building on loop tiling, cross-block data transmission transfers data blocks more accurately, which improves the utilization of device memory and reduces redundant data transmission.

The data optimization methods proposed in this thesis have been implemented in the Auto-ACC system, and test results show that they are effective.
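The idea behind contributions (2) and (3) can be illustrated with a minimal sketch in C. It is not the Auto-ACC implementation and does not use the thesis's extended clauses, whose syntax the abstract does not give. It assumes a simple 3-point stencil, tiles the loop by hand, and uses only the standard OpenACC sub-array notation `array[start:length]` so that each tile copies just the block it reads and writes. The names `stencil_ref`, `stencil_tiled`, `N` and `TILE` are hypothetical. (Standard OpenACC also offers a `tile` clause on the `loop` directive; whether the thesis's tiling clause matches it is not stated.)

```c
#include <stddef.h>

#define N 1000     /* illustrative problem size (assumption) */
#define TILE 128   /* illustrative tile size (assumption) */

/* Untiled reference: 3-point stencil over the interior of `in`. */
void stencil_ref(const float *in, float *out, int n) {
    for (int i = 1; i < n - 1; i++)
        out[i] = in[i - 1] + in[i] + in[i + 1];
}

/* Tiled version: the outer loop walks over tiles, and the data clauses
 * name only the sub-array each tile touches instead of the whole array.
 * Without an OpenACC compiler the pragma is ignored and the loop runs
 * serially on the host with the same result. */
void stencil_tiled(const float *in, float *out, int n, int tile) {
    for (int lo = 1; lo < n - 1; lo += tile) {
        int hi = (lo + tile < n - 1) ? lo + tile : n - 1;  /* exclusive bound */
        int len = hi - lo;
        /* Reads in[lo-1 .. hi] (len + 2 elements), writes out[lo .. hi-1]. */
        #pragma acc parallel loop copyin(in[lo-1:len+2]) copyout(out[lo:len])
        for (int i = lo; i < hi; i++)
            out[i] = in[i - 1] + in[i] + in[i + 1];
    }
}
```

Each tile here transfers `len + 2` input elements and `len` output elements, rather than all `n` elements per kernel launch. That is the kind of reduction in redundant transfer and device memory use the abstract attributes to cross-block data transmission.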
Keywords/Search Tags: Heterogeneous Multi-core Processor, Parallel Compilation, OpenACC, Data Optimization, Loop Tiling, Cross-block Data Transmission