
Research On Memory Mapping Methods Of Reconfigurable And SIMT Processor System Architectures

Posted on: 2018-04-03  Degree: Doctor  Type: Dissertation
Country: China  Candidate: F Han  Full Text: PDF
GTID: 1318330542474311  Subject: Electronic Science and Technology
Abstract/Summary:
Since Moore's Law was introduced, researchers in the field of integrated circuits have worked to keep pace with its prediction. In the More-than-Moore era, the focus has shifted from increasing the integration density of integrated circuits to exploring combinations of new architectures and new technologies. Many innovations have been made in high-performance processor architecture design, an important research area of integrated circuits. Processor architectures with high processing capacity have become a hot research topic, including reconfigurable architectures for specific domains that demand high performance, single instruction multi-threading (SIMT) architectures for highly parallel applications, and many-core architectures with network-on-chip (NoC) interconnection.

As the storage resources in a processor grow, using them efficiently becomes a critical issue in processor design. In processors for data-intensive applications, multi-bank on-chip storage is commonly used to increase memory bandwidth, but bank conflicts severely limit the bandwidth gained from multiple banks. Optimizing the memory address mapping exploits memory bandwidth more efficiently, because a suitable data distribution reduces bank conflicts. In addition, when tasks are mapped according to the characteristics of the application, computation and off-chip access can overlap, so that the computing and memory access resources are used simultaneously. This dissertation focuses on memory mapping in reconfigurable and SIMT processor architectures. The main contributions are as follows:

1. We propose a high-performance coarse-grained reconfigurable architecture for digital signal processing. Based on this architecture, we implement a digital signal processor, the coarse-grained reconfigurable digital signal processor (CRDSP). Based on the application requirements and an analysis of the access latency, area cost and power cost of different memory bank organizations, CRDSP adopts 2 MB of SRAM (Static Random Access Memory) divided into 32 banks as on-chip storage. The proposed architecture has been fabricated in a 40 nm CMOS process. CRDSP runs at a clock frequency of 1 GHz and has a peak computing capability of 69 GFLOPS while consuming less than 1.2 W.

2. For the fast Fourier transform (FFT), one of the most commonly used applications in digital signal processing, several optimizations are made in CRDSP: a radix-2/4/8 butterfly unit with bank-conflict-free access, an optimized twiddle factor generation unit, and double-buffered off-chip access scheduling for ultra-long series. CRDSP supports FFT processing from 128 to 1M points. The computation times of 1K-, 32K- and 1M-point FFTs are 2.57 µs, 82.25 µs and 7.4 ms, respectively. CRDSP outperforms other advanced FFT processors of similar normalized area, and its performance advantage grows as the series length increases.

3. NoC-based multi-core architecture is another key research area for high-performance processors. Multi-core reconfigurable architectures are a potential way to meet higher performance requirements and handle larger-scale applications in future digital signal processing. We introduce a multi-core data-migration-enhanced reconfigurable processor (DMERP) architecture based on CRDSP. DMERP adopts a hybrid on-chip memory architecture with both cache and programmable memory. Cache associativity, cache line size, inclusive versus exclusive organization, and the sharing range of the level 2 cache are simulated with a full-system simulator and a memory model to analyze different cache organizations across the system's cores. Mapping the application flow to hardware resources according to the storage characteristics of the application is important for system efficiency. Memory mappings for a series of large-scale digital signal processing applications are studied to reduce off-chip access by transferring data along specific data migration paths. Implementing this architecture is one of our key tasks for future work.

4. Shared memory in a single instruction multi-thread (SIMT) architecture is a programmable on-chip memory. In general, bank conflicts hamper the performance of parallel accesses in applications. However, a different memory address mapping can distribute access requests across banks more evenly and so avoid conflicts. We propose an adaptive mapping function, which searches for a suitable mapping based on the access patterns observed while the first batch of blocks of a kernel executes. Compared with a baseline graphics processing unit (GPU), the proposed adaptive mapping function reduces bank conflicts by 94.8% and achieves a 23.5% performance speedup.
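The effect of address mapping on bank conflicts can be illustrated with a small simulation. The sketch below is not the adaptive mapping function of the dissertation. It assumes a word-interleaved memory with 32 banks and compares plain modulo mapping against a simple XOR-based mapping, a common textbook alternative. The names `modulo_bank`, `xor_bank` and `conflict_degree` are hypothetical and chosen only for this illustration.

```python
from collections import Counter

NUM_BANKS = 32  # 32 banks as in the abstract; word interleaving is an assumption


def modulo_bank(addr: int) -> int:
    """Baseline mapping: consecutive words go to consecutive banks."""
    return addr % NUM_BANKS


def xor_bank(addr: int) -> int:
    """Illustrative XOR mapping: fold the row index into the bank index."""
    return (addr ^ (addr // NUM_BANKS)) % NUM_BANKS


def conflict_degree(addresses, bank_of) -> int:
    """Largest number of requests landing in one bank (1 means conflict free)."""
    counts = Counter(bank_of(a) for a in addresses)
    return max(counts.values())


# 32 parallel requests walking down one column of a 32-word-wide array.
column_access = [i * NUM_BANKS for i in range(NUM_BANKS)]
# 32 parallel requests to consecutive words of one row.
row_access = list(range(NUM_BANKS))

for name, pattern in (("column", column_access), ("row", row_access)):
    print(name,
          "modulo:", conflict_degree(pattern, modulo_bank),
          "xor:", conflict_degree(pattern, xor_bank))
```

Under these assumptions, the column pattern sends all 32 requests to one bank with modulo mapping, while the XOR mapping spreads them over 32 distinct banks. The row pattern stays conflict free under both mappings. A fixed mapping like this one only suits particular strides, which is the motivation for choosing the mapping according to the observed access pattern.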
Keywords/Search Tags: Memory Address Mapping, Reconfigurable Computing, SIMT Processing Architecture, Shared Memory Mapping, Off-Chip Memory Access Mapping