
Chip Multi-Processor Modeling Method Based On FPGA

Posted on: 2013-02-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Li
GTID: 1228330377451861
Subject: Computer system architecture
Abstract/Summary:
With the development of chip multiprocessors, computer architecture research faces new opportunities and challenges. On one hand, the source of multiprocessor performance gains has shifted from instruction-level parallelism to thread-level and data-level parallelism. To exploit this parallelism, we must break with the traditional software and hardware frameworks and redesign them, including the microarchitecture, programming model, compiler, and runtime system. On the other hand, software simulators, traditionally used in single-core processor architecture research, cannot meet the demands of multi-core processor research. Software simulator performance degrades in proportion to the number of simulated cores, so such simulators cannot support cycle-accurate simulation, full-system simulation, or research on system software. For these reasons, multi-core processor architecture research lacks abundant experimental evaluation and comprehensive guidance, and the software simulator has become its bottleneck. The key to conducting multi-core architecture research effectively is therefore to adopt a new kind of simulator. The inherent parallelism of FPGAs provides better simulation performance and scalability at the hardware level, which makes the FPGA an ideal simulation platform.

This dissertation focuses on methods for modeling chip multiprocessors. The major research contributions include:

(1) Based on a study of functional emulators, performance models, and prototypes, we propose a novel performance model framework in which function and timing are separated. Specifically, the functional partition is responsible only for correctly simulating the processor's actions, without considering the microarchitecture or timing. The timing partition models the microarchitecture of the processor, determines when processor actions occur, and drives the functional partition to simulate the corresponding microarchitecture.
Being microarchitecture independent, a single functional partition can be reused by multiple timing partitions, and it is compatible with other simulation modes, including software simulation and cross-platform simulation. This framework reuses off-the-shelf modules and effectively preserves previous modeling work.

(2) By studying how modules in a simulator are synchronized, we propose a port-based synchronization technique. Port synchronization allows multiple modules in a model to simulate different model cycles at the same time. Fast modules therefore do not need to wait for slow ones, which improves system performance. The larger the speed difference between modules, the greater the observed benefit of port synchronization.

(3) Using software-hardware co-modeling, we propose methods for adjusting FPGA resource occupation and simplifying the modeling process. Because simulating a chip multiprocessor requires vast FPGA resources, we use a software memory buffer technique to store data on the host computer and reduce FPGA resource occupation. Because complex structures are very difficult to simulate on an FPGA, we use software-hardware co-modeling to ease this process. RTL code is hard to debug and slow to compile, so software simulation can be used to reduce modeling complexity and compilation time.

(4) Time-division multiplexing is investigated and a fine-grained time-multiplexing technique is proposed. We divide a module into two parts, state and logic; the state is duplicated for each core, and the logic is reused. Fine-grained time-multiplexing takes the rule as the unit of reuse and lets multiple cores be simulated in one module at the same time. It also increases the FPGA resource utilization rate.

(5) The performance bottleneck is analyzed and several optimization techniques are proposed.
These techniques include delay statistics between the functional and timing partitions, and delay statistics between modules within the timing partition.

(6) Based on all of the above research, we implement a simulation platform, RAMP-Pink. RAMP-Pink supports both transactional memory and thread-level speculation. We adopt the Alpha ISA and provide a multi-thread creation mechanism to replace the PThreads library. This mechanism can also be used on other multi-core simulation platforms without OS support. During deployment, a MESI cache coherence protocol is designed and implemented.

Through the research on FPGA-based processor modeling and the implementation of the RAMP-Pink system, we have reached some important conclusions about hardware modeling. First, the key problem of software simulators for multi-core processors is that they do not scale well as the number of cores increases. An FPGA platform, with its high scalability, can accommodate the growing core count and deliver hardware-level performance. Second, FPGA modeling is more complex and much more time-consuming than software modeling. With the function-timing partition modeling framework and the hardware-software co-modeling technique, we can effectively reduce the modeling effort and the modeling period. Third, modeling multi-core processors requires abundant FPGA resources. Fine-grained time-multiplexing and software-hardware co-modeling can reduce FPGA resource occupation.

The research and experimental results in this dissertation can guide FPGA-based simulation of multi-core processors and its further optimization.
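As an illustration of the port-based synchronization idea in contribution (2), the following is a minimal software sketch in Python. It is not the RAMP-Pink implementation, which targets FPGA hardware. The names `Port`, `Module`, `run`, `latency`, `capacity`, and `host_cost` are hypothetical and chosen only for this example. The sketch assumes that each module keeps its own model-cycle counter and may advance whenever every input port holds a message and every output port has room, so no global model clock is needed.

```python
from collections import deque


class Port:
    """Bounded FIFO between two modules with a fixed latency in model cycles (hypothetical)."""

    def __init__(self, latency=1, capacity=4):
        self.latency = latency
        self.capacity = capacity
        # Pre-fill `latency` empty messages so the receiver can run its first cycles.
        self.queue = deque([None] * latency)

    def can_send(self):
        return len(self.queue) < self.capacity

    def can_recv(self):
        return len(self.queue) > 0


class Module:
    """A model component with its own local model-cycle counter (hypothetical)."""

    def __init__(self, name, inputs, outputs, host_cost):
        self.name = name
        self.inputs = inputs
        self.outputs = outputs
        self.host_cost = host_cost  # host ticks needed to simulate one model cycle
        self.wait = 0               # host ticks left before the next attempt
        self.cycle = 0              # local model cycle
        self.log = []               # (local model cycle, messages received)

    def ready(self):
        # Advance only when every input has a message and every output has room.
        return (all(p.can_recv() for p in self.inputs)
                and all(p.can_send() for p in self.outputs))

    def host_tick(self):
        if self.wait > 0:
            self.wait -= 1
            return
        if not self.ready():
            return  # stall on the ports only; there is no global model clock
        received = [p.queue.popleft() for p in self.inputs]
        self.log.append((self.cycle, received))
        for p in self.outputs:
            p.queue.append(self.cycle)  # payload: the sender's model cycle
        self.cycle += 1
        self.wait = self.host_cost - 1


def run(host_ticks, latency=1, capacity=4):
    """Connect a fast producer to a slow consumer through one port."""
    link = Port(latency, capacity)
    producer = Module("producer", [], [link], host_cost=1)  # fast module
    consumer = Module("consumer", [link], [], host_cost=3)  # slow module
    for _ in range(host_ticks):
        producer.host_tick()
        consumer.host_tick()
    return producer, consumer
```

In this sketch, calling `run(12)` leaves the fast producer several model cycles ahead of the slow consumer, yet every message still reaches the consumer exactly `latency` model cycles after it was sent. The port capacity bounds how far the fast module can run ahead.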
Keywords/Search Tags: chip multi-processor architecture, processor model, functional emulator, performance model, prototype, function-timing partition, time-division multiplexing, software-hardware co-modeling