An RTL Design of 128-bit Data Permute Subset Instructions in a Vector ALU Unit

Posted on: 2017-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: T Zhang
Full Text: PDF
GTID: 2348330488974650
Subject: Engineering
Abstract/Summary:
Operand precision in processors keeps increasing to meet growing demands on data-processing capability. In practice, however, many computations do not need such high precision, so the full potential of the microprocessor design goes unrealized. One alternative that addresses this deficiency is the single-instruction, multiple-data (SIMD) design. Most SIMD memory access units support only contiguous, word-aligned memory access, which differs from the access patterns that arise in the data memory access unit during actual operation. The SIMD instruction set should therefore be extended with a data permute module that rearranges data before vector parallel operations.

Based on AltiVec technology and the PowerPC architecture, this thesis designs the data permute unit for a SIMD parallel processing mechanism. Emphasis is placed on analyzing the system architecture and the instruction set defined by PowerPC. The permute module takes 128-bit operands and is divided into two pipelines. It is implemented in the Verilog hardware description language and completes 53 instructions of the PowerPC ISA, including the Vector Pack, Vector Unpack, Vector Merge, Vector Splat, Vector Permute, Vector Select, Vector Shift, and Vector Gather instructions. This resolves the mismatch between the SIMD vector registers and the actual data load/store process, and it implements the data permute operation that precedes vector parallel processing.

A crossbar switch, which performs the data byte selection, is used in the second pipeline to reduce the complexity of the circuit structure. This construction eliminates redundant circuits and improves related metrics such as module area. The non-blocking internal structure of this circuit organization guarantees the data transmission speed.
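
To make the byte-selection idea concrete, the following is a minimal behavioral sketch in Python. It is not the thesis's Verilog RTL. It models the architectural behavior of an AltiVec-style vector permute, where each result byte is chosen from the 32 bytes of two 128-bit source vectors by the low 5 bits of a control byte. The helper names `merge_high_control` and `splat_control` are hypothetical and exist only to illustrate how merge-like and splat-like patterns could be expressed as control vectors for one shared byte selector, which is the kind of sharing a crossbar makes possible.

```python
from typing import List

VECTOR_BYTES = 16  # a 128-bit operand holds 16 byte lanes


def vperm(va: List[int], vb: List[int], vc: List[int]) -> List[int]:
    """Behavioral model of an AltiVec-style vector permute.

    Each result byte i is taken from the 32-byte concatenation va || vb,
    at the index given by the low 5 bits of control byte vc[i].
    """
    assert len(va) == len(vb) == len(vc) == VECTOR_BYTES
    source = va + vb  # 32 candidate bytes feeding the byte selector
    return [source[sel & 0x1F] for sel in vc]


def merge_high_control() -> List[int]:
    """Hypothetical helper: control vector that interleaves the first
    eight bytes of va with the first eight bytes of vb."""
    control: List[int] = []
    for i in range(VECTOR_BYTES // 2):
        control += [i, VECTOR_BYTES + i]
    return control


def splat_control(lane: int) -> List[int]:
    """Hypothetical helper: control vector that copies one source byte
    (an index into va || vb) to every result lane."""
    return [lane] * VECTOR_BYTES


# Example operands: va holds bytes 0x00..0x0F, vb holds 0x10..0x1F.
va = list(range(0x00, 0x10))
vb = list(range(0x10, 0x20))

merged = vperm(va, vb, merge_high_control())
splatted = vperm(va, vb, splat_control(3))
```

In this model a single selector serves several instruction classes, and only the control vector changes. Whether the thesis generates its crossbar controls in this way is an assumption made here for illustration.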
Keywords/Search Tags: SIMD, Vector Permute Unit, PowerPC