Research On Key Technologies Of Memory System For Synchronous Data Triggered Architecture Based Multicore Processor

Posted on: 2009-12-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J J Guo
Full Text: PDF
GTID: 1118360278456582
Subject: Computer Science and Technology
Abstract/Summary:
Multicore architectures improve processor performance, but the many cores accessing memory demand more memory bandwidth and aggravate the 'memory wall' problem. In this dissertation, we analyze in detail the characteristics of typical applications under the Synchronous Data Triggered Architecture (SDTA). Based on this analysis, the dissertation focuses on the key technologies of the memory system for SDTA-based multicore processors and presents our research on a multicore processor memory hierarchy model, P2P data presending technology, and synchronization technologies for multicore processors. The primary innovations of this dissertation are summarized as follows:

(I) A 'producer-consumer'-based evaluation model for the multicore processor memory hierarchy is proposed. Multicore processor performance is compared across different memory sharing levels, which supports the decision to share memory at the L2 cache level. Based on the 'producer-consumer' property of data access request handling, a memory hierarchy model is built using queueing theory to analyze the configuration of the memory hierarchy and to guide its optimization. The model is used to evaluate the impact of different configuration parameters on processor performance, and the resulting preliminary performance estimates can be used to adjust the memory hierarchy.

(II) A P2P data presending technology for multicore processors is proposed, and a data transfer engine is designed to support it. The technology is studied in order to optimize the 'one-to-many' data consumption relationship in multicore programming: a P2P collaborative communication model is proposed, and the data transfer engine is designed to support P2P data presending.
Theoretical analysis and practical tests both show that the P2P data presending technology effectively improves multicore processor performance.

(III) Two synchronization technologies are proposed for multicore processors.
(a) A synchronization technology based on a synchronization memory is proposed, and a synchronization unit and a synchronization control unit are designed to support it. The synchronization unit is easy to add to an SDTA-based computing core. A dedicated synchronization path is provided that does not disturb normal memory accesses, which helps alleviate the resource contention and cache consistency problems that conventional synchronization causes under a shared-cache structure. Experimental results show that the proposed technology outperforms conventional synchronization technologies under the shared-cache structure.
(b) In the technology based on the synchronization memory, many processor cores access the synchronization memory, which causes contention and increases synchronization latency. Therefore, a synchronization technology based on instruction cache invalidation is proposed. It invalidates the instruction cache line that is about to be executed, causing an instruction cache miss, after which an instruction fetch request is issued to the L2 cache. A filter in the L2 cache screens out instruction fetch requests from processor cores that do not meet the synchronization requirements. This technology thus accomplishes synchronization over the original memory access path.
Experimental results show that this synchronization technology is more scalable and that its performance is comparable to that of the technology based on the synchronization memory.

(IV) An instruction prefetch strategy suited to the characteristics of the SDTA instruction set is proposed, and an instruction cache supporting this strategy is designed. Exploiting the memory-level parallelism of multicore memory accesses, a replacement strategy for the L2 cache is proposed to optimize overall execution time. An L2 cache supporting the synchronization technology based on instruction cache invalidation is designed. A configurable data cache is also designed; it can be configured partly as data cache and partly as scratchpad memory to provide a memory structure appropriate to the data access characteristics.
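
To make the instruction-cache-invalidation synchronization of item (III)(b) more concrete, the following is a minimal behavioral sketch, not the dissertation's design. It assumes that the 'synchronization requirement' is a plain barrier: a fetch is held by the L2-side filter until every participating core has issued its fetch. The abstract does not state the actual condition, and the names `FetchFilter` and `run_barrier` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FetchFilter:
    """Toy model of an L2-side filter that holds instruction fetches.

    Assumption (not from the source): the synchronization requirement is a
    simple barrier, so a fetch is served only once all cores have requested.
    """
    num_cores: int
    waiting: set = field(default_factory=set)

    def fetch(self, core_id: int) -> list:
        """Register a fetch from core_id; return the cores released by it."""
        self.waiting.add(core_id)
        if len(self.waiting) < self.num_cores:
            return []  # requirement not met: the request stays held
        released = sorted(self.waiting)
        self.waiting.clear()  # reset for the next barrier episode
        return released


def run_barrier(arrival_order, num_cores):
    """Replay instruction-cache misses in arrival order and log releases."""
    l2_filter = FetchFilter(num_cores)
    log = []
    for core_id in arrival_order:
        # The core has invalidated its next instruction line, so its fetch
        # misses in the instruction cache and reaches the L2 filter.
        log.append((core_id, l2_filter.fetch(core_id)))
    return log
```

Under these assumptions, `run_barrier([2, 0, 3, 1], 4)` holds the fetches of cores 2, 0, and 3, and releases all four cores when core 1 arrives. The sketch models only the ordering behavior; it says nothing about latency or about how the real filter is implemented in hardware.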
Keywords/Search Tags: multicore processor, synchronous data triggered architecture (SDTA), memory system, memory hierarchy, evaluation model, data presend, synchronization for multicore processor, design and optimization