Research On Key Technology Of Reconfigurable Manycore Stream Processor Architecture

Posted on: 2013-12-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Xu
Full Text: PDF
GTID: 1228330377951861
Subject: Computer system architecture
Abstract/Summary:
As semiconductor technology entered the deep sub-micron era, traditional multicore processor design ran into power, wire-delay, scalability, and many other problems, which restrict further performance gains for traditional multicore processors. Furthermore, the demand for hardware resources differs between applications and varies in phases during execution, so a fixed architecture can hardly guarantee efficient resource allocation, and a mismatch between resource requirements and the resources actually allocated is inevitable. It is therefore urgent to design a new processor architecture that is not only consistent with the development trend of semiconductor technology but also matches the phase characteristics of applications.

In this dissertation, we systematically study the technical issues of reconfigurable manycore architecture, including the programming model, execution model, instruction set, and hardware structures, and then propose and demonstrate a reconfigurable manycore stream processor based on a data-flow-like driven execution model. We analyze the hardware architectures and programming models of state-of-the-art stream processors and distill the trends and key technologies of reconfigurable manycore stream processors. At the software level, we propose an execution framework that supports reconfigurable manycore stream processors; at the hardware level, we propose a tiled reconfigurable manycore stream processor called TPA-S and design the accompanying on-chip L2 cache structures. The main research content and achievements can be summarized in the following four aspects:

We study the characteristics of stream applications and stream processing models, propose a data-flow-like driven execution model based on hyperblocks, and design an instruction set called DISC-S.
In the DISC-S instruction set, the hyperblock is the basic unit of program execution; the instructions within a hyperblock are driven by data flow, and the instruction set supports explicit management of the on-chip memory hierarchy. We then propose a depth-first strategy and a breadth-first strategy for mapping the CUDA programming model onto the DISC-S ISA and discuss the advantages and disadvantages of the two strategies.

We study the hardware model for the data-flow-like driven execution model, propose a reconfigurable manycore stream architecture called TPA-S that supports the DISC-S ISA, and implement a runtime reconfiguration mechanism for logic processors. The TPA-S processor uses the data-flow-like driven execution model and integrates many fine-grained homogeneous physical cores, each containing independent computing and storage components; several physical cores can be combined into a coarser-grained logic processor. The physical cores are interconnected by a two-dimensional mesh fabric built from routing nodes, and data is transferred between physical cores in packets.

We study how different hardware configuration parameters affect the performance of the TPA-S processor, explore its design space, and analyze the key factors affecting program execution performance. We select 10 real-world applications as benchmarks to examine the performance of TPA-S, evaluate different configurations of the computing and storage components in the physical cores, and discuss the impact of different bandwidths and latencies on performance.

We study and propose two on-chip L2 cache designs for the TPA-S processor, one based on the UCA architecture and one based on the Mesh-SNUCA architecture, and use a cycle-accurate software simulator to evaluate the performance of the two L2 cache architectures.
We then discuss a reconfiguration mechanism based on the Mesh-SNUCA architecture: by modifying the address mapping tables in the on-chip L2 cache and in the physical cores, the on-chip L2 cache can be reconfigured at runtime.

We also obtain some important insights from this research: (1) different kinds of applications differ markedly in their demand for resources, and most applications show clear phase behavior, so it is natural to use logic-processor reconfiguration to adapt to these characteristics; (2) a data-flow-like execution model can be used to exploit instruction-level parallelism in stream processors; (3) the performance of the on-chip network is a key factor in the performance of a tiled manycore architecture.

In this dissertation, we propose a data-flow-like reconfigurable manycore stream processor called TPA-S and study its hardware architecture, programming environment, and reconfiguration mechanisms. The results can serve as a reference for the design of high-performance stream processors based on manycore architecture.
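
The abstract says the on-chip L2 cache is reconfigured at runtime by modifying address mapping tables, but gives no detail. The Python sketch below is only one plausible reading of that idea, not the dissertation's actual design. The class `L2MappingTable`, the function `mesh_hops`, the 64-byte line size, and the modulo interleaving of cache lines over banks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a tile in the 2D mesh

LINE_BYTES = 64  # assumed cache-line size; not stated in the abstract


@dataclass
class L2MappingTable:
    """Hypothetical table mapping a logical bank index to a physical bank tile."""
    banks: List[Coord]

    def bank_for(self, addr: int) -> Coord:
        # Static NUCA: the bank is a fixed function of the address.
        # Here consecutive cache lines are interleaved over the listed banks.
        line = addr // LINE_BYTES
        return self.banks[line % len(self.banks)]

    def reconfigure(self, banks: List[Coord]) -> None:
        # Rewriting the table changes which banks serve this address space.
        # Real hardware would also have to flush or migrate cached lines;
        # that step is omitted in this sketch.
        self.banks = list(banks)


def mesh_hops(core: Coord, bank: Coord) -> int:
    """Manhattan distance, i.e. hop count under dimension-ordered mesh routing."""
    return abs(core[0] - bank[0]) + abs(core[1] - bank[1])


core = (0, 0)
table = L2MappingTable(banks=[(0, 0), (1, 0), (2, 0), (3, 0)])
addr = 0x10C0                            # cache line 67
before = table.bank_for(addr)            # 67 % 4 == 3, so bank (3, 0)
table.reconfigure([(0, 0), (0, 1)])      # shrink to two nearby banks
after = table.bank_for(addr)             # 67 % 2 == 1, so bank (0, 1)
```

In this toy setup the same address moves from a bank three hops away to a bank one hop away after the table is rewritten. The sketch only shows the remapping; it does not model capacity, contention, or consistency during reconfiguration.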
Keywords/Search Tags: manycore, reconfigurable, stream processor, data-flow-like driven