Research On 32 Bit High Performance Embedded CPU And Platform

Posted on: 2010-01-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H T Ge
GTID: 1118360302489839
Subject: Circuits and Systems

Abstract/Summary:
As the most important IP in SoC design, the embedded CPU and its development technology have attracted broad attention. This thesis presents an intensive study of the core techniques for developing CKCORE, a domestic 32-bit embedded processor, including its architecture, key modules, verification flow, and related SoC platform.

First, a new micro-architecture for a 32-bit embedded CPU was designed. It addresses several key techniques, including out-of-order execution, fast retirement, branch prediction, write buffering, and CPU hardening, and on this basis the self-developed high-performance, low-power CK510 was realized. To further improve CPU performance, a superscalar processor, CK610, was also studied; the proposed supporting techniques include speculative execution, non-blocking instruction issue and data access, a hardware reserved stack, and a dynamically configurable write-back cache. Both CK510 and CK610 were implemented as hard cores following the industry standard, and their main performance indices reach the level of leading international embedded CPUs.

A SIMD framework based on segmented macro cell sharing was proposed for the design of an enhanced multimedia unit. It is built hierarchically from basic multiplier and adder macros of different widths, which addresses problems of traditional SIMD optimizations such as redundant partial products and long-latency carry chains. Moreover, a DSP extension unit for CKCORE was designed to improve performance in multimedia applications. Key techniques proposed here include delay-quantization-based pipeline partitioning, fully pipelined execution and write-back, non-blocking issue, out-of-order execution, and early retirement with delayed write-back.

For the memory management unit, a fully synthesizable MMU with grouped TLB matching was presented. The TLB is divided into several groups, and the physical address is searched in parallel among these groups. Pipelined TLB matching improves TLB access throughput, and start address prediction helps to reduce TLB search latency. A high-performance, low-power two-level TLB access mechanism is used to obtain high access speed and a low TLB miss rate. A dynamic page-merging technique was also proposed to improve address translation efficiency for each uTLB entry.

A novel equivalence checking flow and its verification system, ZDFV, were put forward for CPU design and verification. ZDFV consists of an RTL-level verification tool, a gate-level verification tool, and an RTL synthesis tool for verification. After intensive study of various verification engines, several new methods were presented to improve efficiency, including an independent cut set and quantization method, a latch matching algorithm, and a mixed SAT flow method.

Finally, a new SoC development platform was built based on the CKCORE CPU. The platform uses the SPIRIT standard for IP configuration and system integration, with XML as metadata. Any functional IP module with an AMBA interface can be integrated into the platform rapidly and conveniently. Given an SoC architecture and its constraints, the platform can automatically generate both an RTL simulation environment and an FPGA emulation platform for hardware/software co-design, so as to ensure the correctness and efficiency of SoC integration.
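The grouped, two-level TLB lookup described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the thesis's design: the class names `GroupedTLB` and `TwoLevelTLB`, the 4 KiB page size, the round-robin fill across groups, the FIFO replacement within a group, and the LRU-managed uTLB are all hypothetical choices made for illustration. Pipelined matching, start address prediction, and page merging are not modeled.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class GroupedTLB:
    """Main TLB split into groups; every group is probed for a VPN."""

    def __init__(self, num_groups=4, entries_per_group=8):
        self.groups = [OrderedDict() for _ in range(num_groups)]
        self.entries_per_group = entries_per_group
        self.next_group = 0  # assumption: round-robin fill across groups

    def lookup(self, vpn):
        # Hardware would probe all groups at once; this loop models that search.
        for group in self.groups:
            if vpn in group:
                return group[vpn]
        return None

    def insert(self, vpn, pfn):
        # Refresh in place if present, so a VPN never lives in two groups.
        for group in self.groups:
            if vpn in group:
                group[vpn] = pfn
                return
        group = self.groups[self.next_group]
        self.next_group = (self.next_group + 1) % len(self.groups)
        if len(group) >= self.entries_per_group:
            group.popitem(last=False)  # assumption: FIFO eviction in a group
        group[vpn] = pfn


class TwoLevelTLB:
    """Small uTLB in front of a grouped main TLB."""

    def __init__(self, page_table, utlb_entries=4):
        self.page_table = page_table  # dict vpn -> pfn, stands in for a table walk
        self.utlb = OrderedDict()
        self.utlb_entries = utlb_entries
        self.main = GroupedTLB()

    def translate(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK
        if vpn in self.utlb:  # first level: uTLB hit
            self.utlb.move_to_end(vpn)
            return (self.utlb[vpn] << PAGE_SHIFT) | offset, "utlb"
        pfn = self.main.lookup(vpn)  # second level: grouped main TLB
        source = "main"
        if pfn is None:
            pfn = self.page_table[vpn]  # KeyError here models a page fault
            self.main.insert(vpn, pfn)
            source = "walk"
        if len(self.utlb) >= self.utlb_entries:
            self.utlb.popitem(last=False)  # evict least recently used uTLB entry
        self.utlb[vpn] = pfn
        return (pfn << PAGE_SHIFT) | offset, source
```

Calling `translate` returns the physical address together with the level that served the request (`"utlb"`, `"main"`, or `"walk"`), which makes it easy to see how a small uTLB backed by a larger grouped TLB avoids repeated table walks.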
Keywords/Search Tags: Embedded CPU, SoC Design Platform, Multimedia Enhancement, SIMD, Memory Management, Formal Equivalence Checking