
Research On Key Technology Of Multi-Core Processor

Posted on: 2014-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: P Ou
Full Text: PDF
GTID: 2208330434972789
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the increasing complexity of communication and multimedia applications, traditional ASIC-based solutions are no longer suitable because of their low flexibility and high design cost. Instead, processor-based systems, especially multi-core processors, are drawing more and more attention. A processor is easy to program and highly flexible, and the implementation cost of a processor-based solution is much lower than that of designing an ASIC chip. Up to now, however, the performance and energy efficiency of processors are still much lower than those of ASICs and sometimes cannot meet the requirements of modern applications.

To solve these problems, this paper analyzes some key technologies in processor design and aims to reduce the efficiency gap between multi-core systems and traditional ASICs by optimizing the processor for two specific domains, multimedia and communication. The main work of this paper is listed below:

1. Cluster based multi-core architecture
The cores in the multi-core processor proposed in this paper are organized in a cluster based architecture. The cores in the same cluster are tightly coupled with each other. They can exchange data efficiently and share some resources, such as memory and the execution array. Though all the cores are the same, there are different kinds of clusters. Some clusters have large data memories, while others contain hardware accelerators for communication and multimedia. Every cluster has its own clock source, so different clusters may work at different frequencies according to their workloads.

2. SIMD instruction set enhancement and pipeline structure optimization
Besides supporting common MIPS instructions, a variety of SIMD instructions are added to this processor to enhance its parallel computing ability. In addition, for convenient access to some hardware functions, we also add several special instructions, such as check, regconfig, etc. Meanwhile, the pipeline structure is changed to meet the needs of SIMD instructions with different bit widths, including splitting the data path, optimizing the bypass module, and so on.

3. Multi-page foreground and background register file
This paper proposes a novel multi-page foreground and background register file architecture. It contains 3X more registers while still using a 5-bit address width. At any one time, only 32 registers can be used by the pipeline; these are called foreground registers. The other registers are called background registers. While the pipeline is processing the data in the foreground registers, the background registers can pre-fetch data from memory or write results back to memory with the help of DMA. Thus, most load/store instructions can be replaced by background data movement. In addition, many special functions are mapped directly into the register file for convenient access.

4. Global network-on-chip, local shared memory communication strategy
In the proposed multi-core system, we use a global network-on-chip, local shared memory inter-core communication strategy. Communication based on shared memory is easy to use and program, while a network-on-chip is more flexible and scalable. In our system, the programmer can use different methods in different situations, so the overall communication efficiency is higher.

5. Double layer network-on-chip combining packet switch and circuit switch
The structure of the network-on-chip is optimized in this processor. We combine traditional packet switching and circuit switching, and use request packets to build the communication channels in the circuit layer. In this way, the network achieves both high energy efficiency and high flexibility. The cores can exchange data over the network-on-chip easily. Background data movement with DMA is also supported.

6. Chip design and test
After designing the architecture, we also finished the backend flow and fabricated a real multi-core processor chip. This chip uses TSMC 65 nm LP technology and contains about 20 million transistors. Its typical frequency and power are 850 MHz and 22 mW per core, so the energy efficiency is 39 GOPS/W. In addition to performance tests, some real applications, such as an H.264 intra decoder and key modules of an LTE system, are also mapped onto our processor, and all achieve good performance.
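The energy efficiency figure quoted for the chip can be roughly cross-checked from the per-core frequency and power. The sketch below assumes one operation per cycle per core, which the abstract does not state; the function name and the `ops_per_cycle` parameter are illustrative only.

```python
def energy_efficiency_gops_per_watt(freq_hz, power_w, ops_per_cycle=1.0):
    """Return per-core energy efficiency in GOPS/W.

    ops_per_cycle is an assumption (not given in the abstract).
    """
    ops_per_second = freq_hz * ops_per_cycle
    return ops_per_second / power_w / 1e9


# 850 MHz and 22 mW per core, as reported for the fabricated chip
eff = energy_efficiency_gops_per_watt(850e6, 22e-3)
print(f"{eff:.1f} GOPS/W")  # prints "38.6 GOPS/W", which rounds to 39
```

Under this assumption the result agrees with the reported 39 GOPS/W. SIMD instructions that perform several operations per cycle would give a higher figure.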
Keywords/Search Tags: Multi-Core Processor, Inter-Core Communication, Network-on-Chip, Cluster-Based Architecture, Single Instruction Multiple Data
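The multi-page foreground and background register file described in item 3 of the abstract can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the thesis design. The class and method names are hypothetical. The page count of 4 reads "3X more registers" as one foreground page plus three background pages, although 3 pages in total is also a possible reading. DMA is reduced to plain list copies.

```python
class PagedRegisterFile:
    REGS_PER_PAGE = 32  # a 5-bit register address selects one of 32 registers

    def __init__(self, num_pages=4):
        # num_pages is an assumption; the abstract only says "3X more registers"
        self.pages = [[0] * self.REGS_PER_PAGE for _ in range(num_pages)]
        self.foreground = 0  # index of the page visible to the pipeline

    def read(self, addr):
        """Pipeline read: always goes to the current foreground page."""
        return self.pages[self.foreground][addr & 0x1F]

    def write(self, addr, value):
        """Pipeline write: always goes to the current foreground page."""
        self.pages[self.foreground][addr & 0x1F] = value

    def dma_fill(self, page, values):
        """Model a DMA pre-fetch from memory into a background page."""
        if page == self.foreground:
            raise ValueError("this sketch only lets DMA touch background pages")
        for i, v in enumerate(values[: self.REGS_PER_PAGE]):
            self.pages[page][i] = v

    def dma_drain(self, page):
        """Model a DMA write-back of a background page to memory."""
        if page == self.foreground:
            raise ValueError("this sketch only lets DMA touch background pages")
        return list(self.pages[page])

    def switch_page(self, page):
        """Make another page the foreground; the old one becomes background."""
        self.foreground = page


rf = PagedRegisterFile()
rf.write(1, 10)             # pipeline works on page 0
rf.dma_fill(1, [7, 8, 9])   # meanwhile, page 1 is pre-loaded in the background
rf.switch_page(1)           # operands are now visible without load instructions
print(rf.read(0), rf.read(2))  # prints "7 9"
```

The point of the model is that the 5-bit address never changes. Only the page selection decides which 32 physical registers the pipeline sees, so data movement can overlap with computation.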