
The Design And Verification Of 32-bit High-Performance DSP SIMD Vector Memory

Posted on: 2016-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: P W Xu
GTID: 2348330509960510
Subject: Software engineering
Abstract/Summary:
The DSP (Digital Signal Processor) is the core engine of a digital signal processing system. As integrated circuit processes and chip design techniques advance, DSP clock frequencies keep rising, and processor computing power grows much faster than memory performance. The resulting "memory wall" increasingly limits processor performance. Exploiting a higher degree of parallelism to further improve performance has therefore become the main challenge in high-performance DSP design. Embedded applications such as wireless communication, image processing, and video and other media processing contain a large amount of data-level parallelism (DLP), and using a SIMD (Single Instruction, Multiple Data) structure to exploit that DLP has become an important direction for high-performance DSP architectures. A key design problem is how to supply multiple SIMD computing units with sustained, high-bandwidth parallel data access, using an efficient vector memory that has few parallel access conflicts and low hardware overhead.

M-DSP is a high-performance multi-core DSP developed independently at our university. It has a self-designed instruction set architecture and is mainly used in embedded communication and image processing systems. The M-DSP core uses an 11-issue VLIW architecture and contains 16 identical VPEs (Vector Processing Elements). To meet the access requirements of the VPEs, this thesis designs and implements a large-capacity on-chip vector memory (VM) with a SIMD structure, which supports parallel DMA and vector access instruction operations with high bandwidth and a low conflict rate.
The main work and innovations of this thesis are as follows:

1. We analyzed several typical applications and algorithms and, on that basis, optimized the VM address decoding process so that the VM supports unaligned access. We designed a set of vector access instructions that support multiple addressing modes and access granularities, including specialized shuffle access instructions that accelerate the FFT algorithm.
2. We designed and implemented two vector instruction access pipelines and the DMA access pipelines, so that the VM supports four parallel access requests: two SIMD unaligned vector access instructions, a DMA read, and a DMA write.
3. We designed a dedicated vector length register. By configuring this register, the programmer can disable VPEs that do not participate in a computation, which provides support for SIMD access with variable vector length.
4. We designed a dedicated DMA write interface. The interface contains eight independent write channels, so that each channel of a DMA access can reach any address within its corresponding address range. In addition, we designed three special vector store instructions that control the DMA configuration registers by configuring the copy of those registers held inside the VM, which helps programmers configure DMA faster.
5. We synthesized the VM with a 40 nm process library, analyzed the timing-violating paths in the reports, and optimized them with typical timing optimization techniques so that the VM reaches a clock frequency of 1 GHz. We also analyzed the area composition of the VM and identified the memory bank organization with the minimum area.
6. Using a SystemVerilog-based verification methodology, we built a hierarchical VM verification platform and performed module-level verification of the VM at a higher level of abstraction, which improved verification efficiency while preserving the completeness and correctness of functional verification. We also completed system-level verification of the VM in the M-DSP single-core environment. The verification results are correct, and code coverage is close to 100%.
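
As a rough illustration of the unaligned access and variable vector length ideas above, the following Python sketch models how one vector access could be decoded into per-bank requests. It is not the thesis design. The bank count (one 32-bit bank per VPE), the word-interleaved address mapping, and the reading of the vector length register as a simple count of enabled VPEs are all assumptions made for this example, and the names `decode_vector_access` and `vl_to_mask` are hypothetical.

```python
NUM_BANKS = 16  # assumption: one 32-bit bank per VPE (16 VPEs in M-DSP)


def vl_to_mask(vector_length):
    """Assumed semantics: enable the lowest `vector_length` VPEs."""
    if not 0 <= vector_length <= NUM_BANKS:
        raise ValueError("vector length out of range")
    return (1 << vector_length) - 1


def decode_vector_access(base_word, vpe_mask=(1 << NUM_BANKS) - 1):
    """Decode a vector access starting at word address `base_word`.

    Returns one (vpe, bank, row) tuple per enabled VPE, assuming
    word-interleaved banks: bank = addr % NUM_BANKS, row = addr // NUM_BANKS.
    """
    requests = []
    for vpe in range(NUM_BANKS):
        if not (vpe_mask >> vpe) & 1:
            continue  # this VPE is disabled by the vector length setting
        addr = base_word + vpe
        requests.append((vpe, addr % NUM_BANKS, addr // NUM_BANKS))
    return requests


if __name__ == "__main__":
    # Unaligned base address 3: banks wrap around and two rows are touched.
    for vpe, bank, row in decode_vector_access(3, vl_to_mask(4)):
        print(f"VPE {vpe}: bank {bank}, row {row}")
```

Under these assumptions, an unaligned base address still maps the 16 consecutive words to 16 distinct banks, so a single vector access causes no bank conflict among its own lanes. The accesses simply span two rows, which is why the row index has to be decoded per bank.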
Keywords/Search Tags:Vector Memory, SIMD, FFT, Shuffle, Unaligned Accessing, Accessing Conflict, Verification