Design And Optimization Of Heterogeneous Multi-Core Processing Chip

Posted on: 2013-05-15
Degree: Master
Type: Thesis
Country: China
Candidate: S Zhou
Full Text: PDF
GTID: 2248330371988171
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
Heterogeneous multi-core architecture is the current trend in multi-core processor design. Its key idea is that one or several general-purpose cores handle task scheduling, while dedicated computing cores handle the main computing tasks (such as floating-point operations, signal processing, and image processing), improving the efficiency and performance of the processor. Many factors affect the performance of heterogeneous multi-core processors; the architecture and functionality of the cores are the most important. This thesis introduces a heterogeneous multi-core processing chip. Using a NoC (Network on Chip) as its top-level architecture, the chip integrates 52 heterogeneous cores, including ARM, Coprocessor, FFT/IFFT Accelerator, and Matrix Transpose Accelerator cores. Experimental results from implementing the chip on FPGAs show that it meets the real-time requirements of the imaging algorithm.

Building on the original design of this heterogeneous multi-core processing chip, the thesis makes several optimizations.

For the NI (Network Interface), the thesis presents a design method based on a micro-code controller and implements a new NI that supports three kinds of link communication protocols. Because it can be programmed with micro-code, this NI is flexible and adaptable. Compared with the original design based on an FSM (Finite State Machine), the overall hardware resource consumption of the new NI is reduced by about 10%.

For the Sin/Cos Computing Unit, the thesis theoretically analyzes the computing deviation of the original design and proposes a new algorithm that improves phase precision through compensation. Based on this algorithm, a high-precision Sin/Cos computing module is proposed that significantly improves the accuracy of the Sin and Cos results. The data representation format is also optimized to save hardware resources. Logic synthesis results show that the new design reduces hardware resource consumption by about 32%.

For the Matrix-Transpose Accelerator, the thesis discusses methods for transposing large matrices in a distributed memory system and presents an improved design of the Transpose Cluster (which includes the Matrix-Transpose Accelerator). Both theoretical analysis and experimental results indicate that the new design greatly increases matrix transposition speed while reducing hardware resource consumption by about 15%. Because many factors affect the performance of the Transpose Cluster (such as the size and shape of the matrix being transposed, the method chosen to divide a large matrix into smaller ones, and the buffer depth), statistical results on these factors are derived from experiments, providing a reference for using the cluster efficiently.
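
The abstract does not say how the Transpose Cluster divides a large matrix, so the following is only a minimal software sketch of the general idea of transposing a matrix tile by tile. It assumes fixed-size square tiles held in ordinary Python lists; the function name `transpose_tiled` and the `tile` parameter are hypothetical and do not come from the thesis, and the sketch models neither the distributed memory nor the buffers of the actual hardware.

```python
def transpose_tiled(matrix, tile=4):
    """Transpose a rectangular matrix by visiting it one tile at a time.

    Illustrative assumption only: `tile` is the side length of each square
    sub-block. The thesis's own partitioning scheme is not specified here.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    # The result has the input's dimensions swapped.
    out = [[None] * rows for _ in range(cols)]
    # Walk over the top-left corner of every tile.
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            # Copy one tile, clipping at the matrix edges so that
            # sizes not divisible by `tile` are still handled.
            for r in range(r0, min(r0 + tile, rows)):
                for c in range(c0, min(c0 + tile, cols)):
                    out[c][r] = matrix[r][c]
    return out


if __name__ == "__main__":
    example = [[1, 2, 3],
               [4, 5, 6]]
    print(transpose_tiled(example, tile=2))
```

In a sketch like this, the tile size plays a role loosely comparable to the division method and buffer depth that the abstract lists as performance factors: smaller tiles need less intermediate storage per step but require more steps.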
Keywords/Search Tags: Heterogeneous Multi-Core, Network on Chip, Network Interface, Sin/Cos Computing Unit, Matrix Transpose