Research On The Design And Performance Optimization Of Memory System For Stream Architecture

Posted on: 2008-10-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Y Ma
Full Text: PDF
GTID: 1118360278456527
Subject: Electronic Science and Technology
Abstract/Summary:
With the growing popularity of stream applications and the development of VLSI technology, traditional high-performance processor architectures face a series of challenges. Stream architecture is a high-performance processor architecture oriented to stream applications. It can fully exploit the parallelism and locality in stream applications and thereby deliver high performance for them.

Stream architectures generally adopt a software-managed stream memory system, which is superior to a traditional memory system for stream applications, but little further research on stream memory systems has been published. Given the memory access characteristics of stream applications and the results of previous research, developing memory access mechanisms that support stream applications efficiently remains challenging.

This dissertation studies the stream memory system in stream architecture. We survey existing research on stream memory systems thoroughly, propose a new design approach for the stream memory system, and implement it in the FT64 processor. Furthermore, based on the data access characteristics of stream applications, we propose several optimization techniques for the stream memory system of the FT64 stream processor.

The main contributions of this dissertation are as follows:

1. Memory systems in current computer architectures are analyzed and related work is discussed. Focusing on the architectures and access characteristics of the hardware-managed cache memory system and the software-managed stream memory system, we analyze the differences between them in bandwidth requirements, latency hiding, energy efficiency and software complexity.

2. Targeting the access characteristics of representative stream applications, we propose a new design approach for the stream memory system and implement it in the FT64 processor. The FT64 memory system is organized into three levels and adopts a memory bandwidth matching design to improve computing performance and reduce bandwidth demand. It directly supports three address generating modes: constant stride, indexed (scatter/gather), and bit-reversed.

3. Targeting data reuse in stream applications, we propose SDR-Cache, a cache structure oriented to stream data reuse, and optimize its performance with the FMB write-directly and life-time speculation techniques. Guided by the compiler, SDR-Cache captures stream-level data and thereby realizes ITR and IPCL reuse. The FMB write-directly technique avoids filling the cache with cachelines that will be fully modified. Life-time speculation lets the cache invalidate data that will not be used again instead of writing it back to memory. These techniques noticeably reduce access delay in many stream programs.

4. In current chip designs, large-capacity on-chip memory can only be accessed at half the clock frequency. We propose a virtual full-frequency access approach, which divides a single memory into multiple banks controlled by clocks with different phases. Interleaving data across the banks on the low-order address bits allows the memory to be accessed in a full-frequency pipelined manner, which increases the effective bandwidth.

5. To improve the utilization of available cycles on the memory data bus, we propose a memory scheduling mechanism oriented to stream applications. Based on the characteristics of stream data organization, the mechanism uses a two-dimensional data buffer to combine access requests and thus takes full advantage of the memory bandwidth.

6. Based on an analysis of the access characteristics of stream applications, we propose a DRAM page strategy based on stream address analysis. By recording and analyzing the address distribution of waiting requests, the strategy predicts the upcoming accesses to each bank and then has the DRAM bank precharge at an appropriate time.

Experimental results show that the proposed stream memory system design and the related optimization mechanisms reduce data access delay effectively and improve system performance noticeably. This dissertation provides both theoretical and practical foundations for further improvement of stream memory system performance.
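
The three address generating modes named in contribution 2 (constant stride, indexed, and bit-reversed) can be illustrated with a small software sketch. This is not the FT64 hardware design, which the abstract does not describe; the function names, the word-granular addressing, and the power-of-two restriction on the bit-reversed stream length are all assumptions made for illustration.

```python
from typing import List, Sequence


def stride_addresses(base: int, stride: int, count: int) -> List[int]:
    """Constant stride: element i is at base + i * stride (word addresses, assumed)."""
    return [base + i * stride for i in range(count)]


def indexed_addresses(base: int, indices: Sequence[int]) -> List[int]:
    """Indexed (scatter/gather): element i is at base + indices[i]."""
    return [base + idx for idx in indices]


def bit_reverse(value: int, bits: int) -> int:
    """Reverse the low `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_addresses(base: int, count: int) -> List[int]:
    """Bit-reversed: element i is at base + bitrev(i).

    Assumes `count` is a power of two, as in radix-2 FFT reordering.
    """
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a positive power of two")
    bits = count.bit_length() - 1
    return [base + bit_reverse(i, bits) for i in range(count)]
```

For example, bit_reversed_addresses(0, 8) visits the offsets 0, 4, 2, 6, 1, 5, 3, 7, which is the reordering a radix-2 FFT applies to its input or output, while indexed_addresses takes its offsets from an index stream supplied by the program.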
Keywords/Search Tags: stream architecture, stream memory system, multilevel memory hierarchy, FT64, SDR-Cache, data reuse, register file, virtual full-frequency access, memory scheduling, page strategy