
Key Techniques On Data Stream Speculation For Heterogeneous Multi-Core Digital Signal Processors

Posted on: 2008-09-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Wang
GTID: 1118360242999350
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Heterogeneous multi-core digital signal processors (DSPs) are a class of powerful, flexible embedded SoC processors that integrate multiple DSP cores with other processor cores and assign different computing tasks to different cores for parallel processing. Because they are used for data-intensive applications, heterogeneous multi-core DSPs need higher memory bandwidth, more flexible memory structures, and more powerful on-chip data paths than conventional DSPs. Relieving the pressure that the memory wall places on their performance and scalability is one of the most important research problems in the architectural exploration of heterogeneous multi-core DSPs.

Data speculation techniques are an efficient way to increase the overlap between computation and data access and thereby ease the memory wall. They reduce access misses and hide remote-access latency by issuing remote data accesses speculatively and bringing the required data into local memories close to the processors (e.g. data caches) in advance. This thesis explores several data stream speculation techniques, based on the data streaming characteristics of heterogeneous multi-core DSPs, that increase data access bandwidth, optimize the on-chip memory hierarchy, and improve the data transmission paths. Each technique is analyzed and evaluated in detail on our heterogeneous multi-core DSP platforms, SDSP and PolyDSP. The main contributions and innovations of this thesis are as follows:

1) With the support of our DSP research team, a heterogeneous multi-core DSP super-node based on a shared memory structure, SDSP, is designed, and a larger-scale multi-core DSP system, PolyDSP, is constructed by interconnecting multiple SDSP super-nodes through a network-on-chip. This thesis completes the synchronization and communication schemes and defines the programming framework and the parallelization mapping approaches for SDSP and PolyDSP.

2) A comprehensive analysis of the data streaming characteristics of typical multi-core DSP applications is carried out. The results show that the data accesses of DSP cores, the data shared among DSP cores, and the data blocks that miss under cache coherence all exhibit clear streaming characteristics. Moreover, the data streams shared among DSP cores show similar production and consumption orders and similar access locality.

3) To reduce cache coherence misses and hide remote-access latency, a data stream clustered forwarding technique, DSCF, is proposed for multi-core DSPs with shared memory structures. DSCF uses custom hardware control modules to carry out inter-core forwarding operations requested through special software primitives. It transmits the data blocks needed by consumer DSP cores to their private data caches cluster by cluster and matches the transmission rate to the consumption rate. Experimental results show that DSCF effectively reduces the cache coherence miss ratio and improves system performance.

4) To optimize the memory hierarchy of heterogeneous multi-core DSPs, a fast, closely coupled shared scratchpad storage technique for small-scale multi-core DSPs is proposed, and a corresponding model, FCC-SDP, is constructed. FCC-SDP consists of multiple banks of small-capacity scratchpad memory, is equipped with a synchronization framework based on hardware semaphores, and supports parallel access by multiple DSP cores and point-to-point event synchronization. Experimental results show that FCC-SDP has a clear performance advantage over VS-SPM. Combining the FCC-SDP data mapping method with a shared cache structure improves on-chip data reuse and further enhances system performance.

5) To manage data streams more efficiently, a Data Streams Transmission Control Engine (DSTCE) is designed for heterogeneous multi-core DSPs, and a method for speculative data stream transmission based on DSTCE is proposed. DSTCE provides programmable background transmission and is optimized for data stream transmission among heterogeneous processor cores, communication among super-nodes, and parallel programming of the whole system. This thesis provides custom speculative operation primitives to implement speculative data stream transmission among different module ports.

6) The external memory control interface (EMCI) of SDSP is designed, and a data stream prefetch technique based on chain tables is proposed to optimize access bandwidth. EMCI supports high-speed DDR2 memory modules and several asynchronous memories simultaneously. To prefetch data streams from external memory, two prefetch buffers based on a chain table structure are constructed to predict and prefetch the data streams associated with L2 cache misses. Compared with two existing methods, our method achieves satisfactory prefetch hit ratios, prefetch effectiveness, and performance speedup, and provides better cost-efficiency.
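The cluster-by-cluster, rate-matched forwarding described for DSCF in contribution 3 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's hardware design: the function name `consume_stream`, the FIFO cache model, and the parameters `cluster_size` and `cache_blocks` are all hypothetical, and the consumer is assumed to read the shared stream strictly in production order.

```python
from collections import deque


def consume_stream(num_blocks, cluster_size, cache_blocks, forwarding=True):
    """Toy model of a consumer core reading a shared stream of blocks.

    Returns (misses, peak_occupancy), where a miss means the wanted block
    was not already in the consumer's private cache and peak_occupancy is
    the largest number of forwarded-but-unconsumed blocks held at once.
    All names and parameters are illustrative assumptions.
    """
    cache = deque()        # forwarded blocks not yet consumed, in order
    next_to_forward = 0    # first block the forwarding engine has not sent
    misses = 0
    peak = 0
    for wanted in range(num_blocks):
        if forwarding:
            # Rate matching: send the next cluster only when the whole
            # cluster fits without displacing unconsumed blocks.
            while (next_to_forward < num_blocks
                   and len(cache) + cluster_size <= cache_blocks):
                end = min(next_to_forward + cluster_size, num_blocks)
                cache.extend(range(next_to_forward, end))
                next_to_forward = end
        peak = max(peak, len(cache))
        if cache and cache[0] == wanted:
            cache.popleft()    # block was forwarded ahead of use
        else:
            misses += 1        # block must be fetched remotely on demand
    return misses, peak
```

In this model, forwarding removes the on-demand misses while the rate-matching condition keeps the number of outstanding forwarded blocks bounded by the assumed cache capacity; real behavior depends on timing and contention that the sketch does not model.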
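The chain-table prefetching idea in contribution 6 can be sketched as a table that links each missed block to the block that missed next, so that a later miss can follow the chain and fetch its likely successors early. The abstract does not give the actual table layout or the role of each of the two prefetch buffers, so everything below is an assumption made for illustration: the class name `ChainPrefetcher`, the single buffer, the 64-byte block size, and the `depth` and `buffer_blocks` parameters are hypothetical.

```python
from collections import OrderedDict

BLOCK = 64  # assumed block size in bytes (not stated in the abstract)


class ChainPrefetcher:
    """Toy miss-chain prefetcher driven by a trace of L2 miss addresses."""

    def __init__(self, depth=2, buffer_blocks=8):
        self.next_miss = {}          # chain table: block -> block that missed next
        self.buffer = OrderedDict()  # prefetch buffer, oldest entry first
        self.depth = depth           # how many chain links to follow per miss
        self.buffer_blocks = buffer_blocks
        self.last_miss = None
        self.hits = 0
        self.misses = 0

    def _prefetch(self, block):
        # Insert or refresh the block, then evict the oldest entries if full.
        self.buffer[block] = True
        self.buffer.move_to_end(block)
        while len(self.buffer) > self.buffer_blocks:
            self.buffer.popitem(last=False)

    def on_l2_miss(self, addr):
        block = addr // BLOCK
        if block in self.buffer:
            self.hits += 1           # served from the prefetch buffer
            del self.buffer[block]
        else:
            self.misses += 1         # must go to external memory
        # Learn the link from the previous miss to this one.
        if self.last_miss is not None:
            self.next_miss[self.last_miss] = block
        self.last_miss = block
        # Follow the chain and prefetch the predicted successors.
        cur = block
        for _ in range(self.depth):
            cur = self.next_miss.get(cur)
            if cur is None:
                break
            self._prefetch(cur)


def replay(trace, depth=2, buffer_blocks=8):
    """Feed a list of miss addresses to a fresh prefetcher; return (hits, misses)."""
    p = ChainPrefetcher(depth=depth, buffer_blocks=buffer_blocks)
    for addr in trace:
        p.on_l2_miss(addr)
    return p.hits, p.misses
```

For a repeating miss pattern such as `replay([0, 640, 128, 4096] * 3)`, the first pass only trains the chain table, and later passes are mostly served from the prefetch buffer. A design like this trades table storage for coverage of irregular but recurring streams, which a simple stride prefetcher would miss.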
Keywords/Search Tags: Heterogeneous Multi-core DSP, Memory Wall, Data Speculation, Data Streams, Data Prefetch, Data Forward, Scratchpad Memory