
FT64 Stream Processing Technology: Architecture, Programming Language, Compiler And Programming

Posted on: 2008-10-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X B Yan
Full Text: PDF
GTID: 1118360242999233
Subject: Computer Science and Technology
Abstract/Summary:
Modern semiconductor technology makes arithmetic inexpensive and bandwidth expensive. To exploit this shift in cost, a high-performance computer system must exploit locality, to raise the arithmetic intensity (the ratio of arithmetic to bandwidth) of the application, as well as parallelism, to keep a large number of arithmetic units busy. Traditional programming models (such as OpenMP and MPI) and processor architectures (such as multi-core processors) focus on exploiting parallelism and provide insufficient support for exploiting locality (such as producer-consumer locality).

The stream programming model fulfills both requirements. It exposes large amounts of parallelism across stream elements and reduces global bandwidth by expressing locality within and between kernels. A stream architecture exploits the parallelism exposed by the stream programming model by providing hundreds of arithmetic units, and exploits the locality of a stream program by providing a deep register hierarchy.

This thesis presents our study of stream architecture, programming language, compiler and programming method. Our contributions are as follows:

1) We design and implement a 64-bit stream processor for scientific computing, Fei Teng 64 (FT64), which has a peak performance of 16 GFLOPS. The design of FT64 is optimized and scaled in its Instruction Set Architecture, Stream Controller, Micro Controller, ALU Cluster, Memory Hierarchy, Network Interface and Host Interface. For example, FT64's instruction set architecture includes multiple fused multiply-add instructions. In addition, two kinds of communication, message passing and stream communication, are employed for FT64-based high-performance computers.

2) We design and implement Stream FORTRAN 95 (SF95), a stream programming language for scientific computing. SF95 extends FORTRAN 95 with 10 compiler directives, coded as comments. These compiler directives precisely characterize the features of the stream processor architecture by defining basic streams, derived streams, and kernels.

3) We design and implement a compiler, SF95Compiler, for the SF95 programming language. It integrates many compiler technologies to support scientific computing, including Stream Transformation, Code Optimization and an Inline Transcendental Function Library.

4) We present a method to transform traditional programs into SF95 programs, which enables stream program design and allows existing scientific programs to be carried over. First, we define loop streamization and the notion of a streamizable loop by comparison with the definitions of vectorization and parallelization. Then, based on the theory of dependence analysis, we demonstrate the relations between stream loops and serial, vector and parallel loops, and provide a method to transform these three kinds of programs into SF95 programs.

We perform experiments on FT64 with nine typical scientific application kernels: 3 NPB benchmarks (EP, MG and CG), one SPEC2000 benchmark (Swim) and 5 important scientific application kernels (FFT, Laplace, Jacobi, GEMM and NLAG-5). The results show that for eight of them, FT64 performs as well as or better than Itanium 2.
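The two central notions of the abstract, arithmetic intensity and producer-consumer locality between kernels, can be illustrated with a toy accounting model. The sketch below is not FT64 or SF95 code: the kernels `scale_add`, `square` and `fused`, the `Traffic` counter and the operation counts are all hypothetical, chosen only to show how keeping an intermediate stream local (instead of writing it to global memory and reading it back) raises the ratio of arithmetic to bandwidth.

```python
from dataclasses import dataclass


@dataclass
class Traffic:
    """Toy counter: arithmetic operations and words moved to/from global memory."""
    flops: int = 0
    words: int = 0

    @property
    def intensity(self) -> float:
        # Arithmetic intensity: operations per word of global memory traffic.
        return self.flops / self.words


def scale_add(x, a, b, t):
    """Hypothetical producer kernel: y[i] = a * x[i] + b (2 operations per element)."""
    t.words += len(x)            # load the input stream from global memory
    y = [a * v + b for v in x]
    t.flops += 2 * len(x)
    t.words += len(y)            # store the intermediate stream to global memory
    return y


def square(y, t):
    """Hypothetical consumer kernel: z[i] = y[i] * y[i] (1 operation per element)."""
    t.words += len(y)            # reload the intermediate stream
    z = [v * v for v in y]
    t.flops += len(y)
    t.words += len(z)            # store the output stream
    return z


def fused(x, a, b, t):
    """Both kernels chained; the intermediate value never reaches global memory."""
    t.words += len(x)            # load the input stream once
    z = []
    for v in x:
        y = a * v + b            # intermediate stays local (models on-chip storage)
        z.append(y * y)
    t.flops += 3 * len(x)
    t.words += len(z)            # store only the final output stream
    return z


x = [1.0, 2.0, 3.0, 4.0]

separate = Traffic()
z_separate = square(scale_add(x, 2.0, 1.0, separate), separate)

chained = Traffic()
z_chained = fused(x, 2.0, 1.0, chained)

print(z_separate == z_chained)                   # same results either way
print(separate.intensity, chained.intensity)     # intensity without and with locality
```

In this model both versions perform the same arithmetic, but the chained version moves half as many words through global memory, so its arithmetic intensity doubles. A real stream processor achieves the same effect in hardware by passing intermediate streams between kernels through its on-chip register hierarchy.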
Keywords/Search Tags:stream architecture, stream programming model, stream programming language, compiler and optimization technology, stream programming