
Design And Research Of Multicore Processor

Posted on: 2011-01-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J M Li
GTID: 1118330332460589
Subject: Computer application technology
Abstract/Summary:
Improving overall microprocessor performance has long been a primary goal of computer architecture research. Rapid advances in semiconductor manufacturing have given chip designers far more on-chip resources to work with. How to use these growing on-chip resources to build faster, more efficient, and more widely applicable microprocessors is now one of the central questions in computer architecture. With the goal of improving processor efficiency, this dissertation studies the key techniques involved in depth. The main results are as follows.

To address the growing difficulty of raising the processor clock frequency and the problem of superscalar pipeline stalls, an LBC heterogeneous multi-core processor architecture is introduced. It comprises a loop detector, a special instruction queue called the Backup Ins Queue, a C-Core controller, and C-Bus, a fast channel for sharing data among E-Cores. This architecture optimizes the loops that occur frequently in programs and also avoids the pipeline flush caused by branch misprediction, which improves the overall efficiency of the processor.

The MSI and MESI protocols were analyzed, and their drawbacks in access time, access delay, and bus load were identified. A scheme is then presented that adds an SC-Cache to the original CMP architecture to store information about blocks whose copies are shared by multiple processors. A CSC monitoring protocol was designed to coordinate the added SC-Cache with the other caches and the main memory.
Simulation results showed that the SC-Cache design reduced the overhead of maintaining cache coherence and improved the performance of the storage system as a whole.

Statistical analysis of branch characteristic libraries showed that most branches occur in loop code. Current microprocessor architectures, however, do not optimize loop-based programs well. A loop detector design is therefore proposed that spares the processor from repeatedly decoding the instructions of a loop.

Statistics on the prediction accuracy of GAs two-level dynamic branch predictors in a superscalar pipeline showed that about 6% to 16% of branch predictions are wrong, and that recovering the pipeline after a misprediction usually takes three clock cycles. To address this, a B-Cache design for branch misprediction recovery is proposed, which reduces the recovery time from three clock cycles to one.
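
The effect of the shorter recovery time can be estimated with simple arithmetic. The sketch below is not taken from the dissertation: the function name `expected_recovery_cycles` is hypothetical, and it assumes that the only cost counted is the recovery delay itself, paid once per mispredicted branch. It uses only the figures quoted above (a 6% to 16% misprediction rate, and three cycles versus one cycle of recovery).

```python
def expected_recovery_cycles(mispredict_rate: float, recovery_cycles: int) -> float:
    """Average recovery cycles paid per predicted branch.

    Assumption (not from the source): every misprediction costs exactly
    `recovery_cycles` cycles and nothing else is counted.
    """
    if not 0.0 <= mispredict_rate <= 1.0:
        raise ValueError("mispredict_rate must be between 0 and 1")
    if recovery_cycles < 0:
        raise ValueError("recovery_cycles must be non-negative")
    return mispredict_rate * recovery_cycles


# Misprediction rates quoted in the abstract for GAs two-level predictors.
for rate in (0.06, 0.16):
    baseline = expected_recovery_cycles(rate, 3)    # three-cycle recovery
    with_bcache = expected_recovery_cycles(rate, 1)  # one-cycle recovery
    print(f"rate={rate:.0%}: {baseline:.2f} -> {with_bcache:.2f} cycles per branch")
```

Under these assumptions the average recovery cost per branch falls from 0.18 to 0.06 cycles at a 6% misprediction rate, and from 0.48 to 0.16 cycles at 16%. The effect on total execution time also depends on how frequent branches are in the program, which the abstract does not state.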
Keywords/Search Tags: Processor, Multi-core, Architectural Design, Cache Coherence Protocol, Instruction Branch Prediction