Research and Implementation of Reconfigurable Computing for Communication Baseband Signal Processing

Posted on: 2016-07-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Mei
GTID: 1318330482974064
Subject: Electronic Science and Technology
Abstract/Summary:
With the rapid development of communications, computer, and microelectronics technology, wireless communication standards are evolving continuously and multiple standards coexist. Baseband platforms must therefore combine high computational capability with flexibility, which is a major challenge for architecture designers. State-of-the-art communication protocols such as LTE-A, WiMAX, and Wi-Fi all adopt Multiple-Input Multiple-Output Orthogonal Frequency Division Multiplexing (MIMO-OFDM). A Coarse-Grained Reconfigurable Architecture (CGRA) can meet the computing requirements of various algorithms by reconfiguring the hardware after silicon design. Combining the flexibility of general-purpose processors with the energy efficiency of ASICs, CGRAs are strong candidates for wireless baseband processing.

By refining and extending traditional models, a systematic and accurate analytical model for CGRAs is established. Guided by the insight from this model, a domain-specific high-performance CGRA named RaSP-BB (ReconfigurAble Signal Processor BaseBand) is designed for communication baseband applications. When processing kernel MIMO-OFDM algorithms, RaSP-BB achieves high execution efficiency in the computation array and high memory-access performance. The main work and innovations are as follows:

(1) An analytical model for CGRAs is established, comprising a loop pipeline model based on bubble analysis, a memory access model based on access weight analysis, and a multi-task synchronization model based on algorithm configuration analysis. By abstracting the characteristics of algorithms and the micro-architecture parameters, the model quantitatively analyzes how software and hardware parameters influence system performance.
Meanwhile, the CPI (Cycles Per Instruction) stack provided by the model guides the architecture design by locating the system performance bottleneck.

(2) The computation array of RaSP-BB is optimized: a multi-layer routing structure is proposed and the array interfaces are improved. An accumulation feature is added to the traditional crossbar routing structure to better match the data-flow characteristics of the algorithms. Based on data locality analysis, the input and output interfaces are designed with a variety of sources and destinations. These methods both improve the utilization of processing elements and reduce pipeline bubbles.

(3) The hierarchical memory architecture of RaSP-BB is optimized: a conflict-reducing structure is proposed for the shared memory, and the multi-mode self-adaptive structure of the local memory is improved. In the shared memory, the algorithm mapping mechanism and the address re-mapping scheme can be configured to fit the characteristics of the algorithm. The multi-mode memory supports various access patterns, such as transposition, overlapping, and combination. With these methods, data transfers between different computing engines are eliminated and the access delays caused by conflicts and access patterns are reduced.

RaSP-BB has been implemented at RTL and simulated to verify the proposed design methods for reconfigurable baseband signal processing architectures. Comparing the RTL simulation with the model results, the evaluation accuracy of the analytical model is 94.52% on the CGRA REMUS-II and 93.83% on RaSP-BB.
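
The CPI-stack idea in item (1) can be illustrated with a minimal sketch. The component names (`base`, `pipeline_bubbles`, `memory_conflicts`, `sync_wait`) and every number below are hypothetical, chosen only to mirror the three sub-models named in the abstract. The accuracy formula (one minus the relative error against the RTL cycle count) is likewise an assumption, since the abstract does not define how accuracy is measured.

```python
def cpi_stack(cycles_by_cause, instructions):
    """Split total cycles into per-cause CPI contributions.

    The contributions sum to the overall CPI, which is what makes
    the breakdown a "stack".
    """
    return {cause: cycles / instructions
            for cause, cycles in cycles_by_cause.items()}


def bottleneck(stack):
    """Return the largest stall contributor, ignoring the ideal 'base' part."""
    stalls = {cause: cpi for cause, cpi in stack.items() if cause != "base"}
    return max(stalls, key=stalls.get)


def model_accuracy(model_cycles, rtl_cycles):
    """Assumed accuracy metric: 1 - relative error versus RTL simulation."""
    return 1.0 - abs(model_cycles - rtl_cycles) / rtl_cycles


# Hypothetical cycle counts for one kernel (not taken from the dissertation).
cycles = {
    "base": 1000,              # cycles with a perfectly filled pipeline
    "pipeline_bubbles": 300,   # loop pipeline model
    "memory_conflicts": 500,   # memory access model
    "sync_wait": 200,          # multi-task synchronization model
}

stack = cpi_stack(cycles, instructions=1000)
total_cpi = sum(stack.values())
predicted_cycles = sum(cycles.values())

print("CPI stack:", stack)
print("total CPI:", total_cpi)
print("bottleneck:", bottleneck(stack))
print("accuracy vs. a hypothetical RTL count of 2100 cycles:",
      model_accuracy(predicted_cycles, 2100))
```

In this toy breakdown the largest stall term points at the memory system, which is the kind of hint a designer would use to decide where to spend optimization effort.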
Implemented in TSMC 45 nm technology with a system frequency of 400 MHz, RaSP-BB achieves the following performance on kernel algorithms of wireless baseband processing. The 4096-point FFT requires 3295 cycles (about 8.24 µs); the K-best algorithm achieves a throughput of 1.072 Gbps in a 4 × 4 MIMO detection system (K = 3); mapping and de-mapping need 0.55 and 0.28 cycles/symbol, respectively; and the 32-tap FIR runs at 2.4 cycles/symbol on average. Compared with REMUS-II, RaSP-BB improves performance by 39.98% on average. Compared with state-of-the-art CGRAs, RaSP-BB achieves high performance on complex algorithms such as FFT and MIMO detection while remaining flexible enough for other communication algorithms such as FIR and mapping.
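
The cycle counts quoted above convert to time and symbol rates by simple arithmetic at the stated 400 MHz clock, where one cycle lasts 2.5 ns. The sketch below uses only figures quoted in the abstract. The helper names are illustrative, and the symbol rates assume the quoted cycles/symbol figures are sustained.

```python
CLOCK_HZ = 400e6  # system frequency stated in the abstract


def cycles_to_microseconds(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to execution time in microseconds."""
    return cycles / clock_hz * 1e6


def symbols_per_second(cycles_per_symbol, clock_hz=CLOCK_HZ):
    """Convert a cycles/symbol cost to a sustained symbol rate."""
    return clock_hz / cycles_per_symbol


fft_us = cycles_to_microseconds(3295)
print(f"4096-point FFT: {fft_us:.4f} us")

for name, cps in [("mapping", 0.55), ("de-mapping", 0.28), ("32-tap FIR", 2.4)]:
    rate = symbols_per_second(cps)
    print(f"{name}: {rate / 1e6:.1f} Msymbol/s")
```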
Keywords/Search Tags: Coarse-Grained Reconfigurable Architecture, Analytical Model, Algorithms of Communication Baseband Processing, CGRA Computation Array, CGRA Memory, Sub-system