Research On The Key Technologies For Chip Multi-Processor

Posted on: 2012-08-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X L Gu
GTID: 1118330371456287
Subject: Communication and Information System
Abstract/Summary:
By the 2000s, growth in single-processor performance had stalled because of the "power wall" and the limited instruction-level parallelism (ILP) available in single-threaded application programs. The chip multi-processor (CMP), which offers powerful parallel processing capability, higher data communication bandwidth, better resource utilization and good scalability, has become the new path for processor performance growth under Moore's law. However, as a new technology, the CMP also brings many software and hardware challenges that slow its development. The goal of this thesis is to improve the efficiency of the CMP system. We consider these problems from a system-level view and study three key technologies in CMP design: the short-message passing on-chip network, a highly efficient scheduler for the multi-core operating system, and a control processor with higher single-threaded performance and higher task throughput.

Firstly, exploiting the physical proximity of the processor cores, this thesis proposes a short-message passing network, constructed from shared register clusters, to handle the transmission of synchronization messages and broadcast data. By exchanging messages quickly, this network meets the low-latency requirement of short messages. It also has low physical delay, small die area and low power consumption, all of which make it easy to integrate into the CMP system. The experimental results show that the short-message passing network reduces task invocation overhead and data broadcasting overhead, bringing a 5.62% to 25.35% performance improvement for the five scientific kernel applications, while the increases in area and power are both less than 1%.

Secondly, this thesis proposes a scheduler for a master-slave real-time operating system (RTOS) to efficiently manage task/thread execution on the distributed CMP system. By defining a protocol between the applications and the RTOS, the difficulty of parallel programming is partly reduced: programmers only need to supply the contents defined in the App-RTOS protocol and do not need to handle communication and synchronization themselves. Instead, the proposed scheduler manages these operations. The scheduler also classifies task invocation flows based on the features of scientific application programs and the data flow graph (DFG) programming model, which clearly reduces task invocation overhead. Efficient RTOS-network functions are defined in the scheduler design to manage network access with low overhead. The experimental results show that, by reducing invocation overhead and communication delay, the proposed scheduler improves CMP system performance by 5.25% to 19.62% on the five scientific kernel application programs. The proposed scheduler requires only 5.38 KB of memory.

Finally, the thesis presents a novel processor architecture that combines out-of-order (O-O-O) superscalar technology and simultaneous multi-threading (SMT) technology to manage the operation of the CMP system. By extracting ILP from single-threaded application programs, the O-O-O superscalar technology efficiently improves the processor's single-threaded performance. The SMT technology improves not only the processor's task throughput but also the utilization of its execution resources. The experimental results show that execution of the serial part of the application programs is accelerated, which delivers a 1.13% to 10.4% performance improvement for the five scientific kernel application programs; the SMT technology also improves the throughput of the chip, which reaches 1303.9 DMIPS/mm².
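
As a rough illustration of the DFG-based task invocation that the scheduler paragraph describes, the following Python sketch shows how a master could release tasks to slave cores once their inputs are available. It is a minimal sketch under stated assumptions, not the scheduler from the thesis: the function name `schedule_dfg`, the task names, the round-robin slave assignment and the treatment of a task as finished at the moment it is dispatched are all hypothetical simplifications introduced here.

```python
from collections import deque


def schedule_dfg(tasks, deps, num_slaves):
    """Return a list of (task, slave_id) pairs in dispatch order.

    tasks: iterable of task names (hypothetical).
    deps: dict mapping a task to the set of tasks it must wait for;
          every predecessor is assumed to appear in `tasks`.
    num_slaves: number of slave cores (assumed >= 1).
    """
    tasks = list(tasks)
    # Unfinished predecessors of each task.
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    # Reverse edges: which tasks consume the output of each task.
    successors = {t: [] for t in tasks}
    for t, preds in remaining.items():
        for p in preds:
            successors[p].append(t)

    # Tasks with no predecessors are ready immediately.
    ready = deque(t for t in tasks if not remaining[t])
    dispatch = []
    next_slave = 0
    while ready:
        task = ready.popleft()
        dispatch.append((task, next_slave))
        next_slave = (next_slave + 1) % num_slaves  # assumed round-robin policy
        # Simplification: the task counts as complete once dispatched,
        # which releases successors whose inputs are now all available.
        for s in successors[task]:
            remaining[s].discard(task)
            if not remaining[s]:
                ready.append(s)

    if len(dispatch) != len(tasks):
        raise ValueError("dependency cycle in data flow graph")
    return dispatch


# Hypothetical four-task graph: two independent kernels between load and merge.
example = schedule_dfg(
    ["load", "fft_a", "fft_b", "merge"],
    {"fft_a": {"load"}, "fft_b": {"load"}, "merge": {"fft_a", "fft_b"}},
    num_slaves=2,
)
print(example)
```

In this sketch the application supplies only the task list and the dependencies, and the scheduler decides ordering and placement, which mirrors the division of labor the App-RTOS protocol is said to provide. A real scheduler would also track task completion messages and timing, which this sketch ignores.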
Keywords/Search Tags: chip multi-processor, networks-on-chip, real-time operating system, out-of-order superscalar, simultaneous multi-threading