
The Optimization Design And Implementation Of Secondary Cache Prefetching On YHFT-XDSP

Posted on: 2018-01-21    Degree: Master    Type: Thesis
Country: China    Candidate: J X Xia    Full Text: PDF
GTID: 2428330569998555    Subject: Electronic Science and Technology
Abstract/Summary:
With the rapid development of integrated circuit design and process technology in recent years, CPU performance has grown by roughly 60% per year while memory performance has grown by only about 7% per year. The "memory wall" problem caused by this widening performance gap between CPU and memory has become a bottleneck for improving overall microprocessor performance. A multilevel cache hierarchy is an effective way to alleviate this problem, and cache optimization has therefore become an important research direction for improving the memory efficiency of microprocessors.

Driven by the requirements of YHFT-XDSP, an independently developed multi-core DSP chip, this thesis studies the optimization design and implementation of its L2 cache prefetching, which is of great significance for improving the performance of YHFT-XDSP. When an instruction or data access misses in the L2 cache, the core has to access external memory or the on-chip shared SRAM, which greatly increases the access latency. To reduce instruction and data access latency and improve the performance of the DSP core, this thesis implements an optimized design of the L2 prefetching unit. The main work is as follows.

First, program and data prefetching optimization schemes for the L2 cache are proposed based on the YHFT-XDSP memory architecture. Following the idea of branch prediction, L2 program (instruction) miss information is tracked, and a method based on a two-bit saturating counter is adopted to optimize program prefetching.

For data prefetching, a prefetch history table is designed, based on local data access information, to track L2 data miss information, and an adaptive data prefetching method is proposed that supports both fixed strides and linearly changing strides. In addition, a confidence mechanism is used to estimate the likelihood that continued prefetching will be useful and to adjust the number of data prefetches issued accordingly.

Prefetch requests are pipelined to improve efficiency. The pipelined structure can handle consecutive miss requests, hiding part of their processing time and reducing the miss penalty seen by requests from the upper level.

Finally, the L2 prefetching unit is verified at module level by simulation, with code coverage close to 100%. Under the same constraints of a 0.66 ns clock cycle and a 40 nm CMOS technology library, logic synthesis of the prefetching unit shows a 15.68% increase in hardware resources. System-level performance evaluation shows that the average hit rates of program prefetching and data prefetching increase by 0.67% and 10.35% respectively. The system performance is thus improved effectively at a small additional hardware cost.
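
To make the adaptive data-prefetching idea easier to follow, the C sketch below models a single prefetch-history-table entry that detects fixed and linearly changing strides and uses a two-bit saturating confidence counter to decide how many lines to prefetch on each L2 data miss. All names and parameters (prefetch_entry, update_prefetch_entry, MAX_DEGREE, the degree schedule) are illustrative assumptions; the actual YHFT-XDSP mechanism is implemented in hardware and its details are not given in this abstract.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_DEGREE 4   /* assumed cap on prefetches issued per miss */

/* Hypothetical behavioural model of one prefetch-history-table entry. */
typedef struct {
    uint32_t last_addr;    /* address of the previous L2 data miss          */
    int32_t  stride;       /* last observed stride                          */
    int32_t  stride_delta; /* change of the stride (for linear patterns)    */
    uint8_t  confidence;   /* two-bit saturating confidence counter (0..3)  */
} prefetch_entry;

/* Update the entry on an L2 data miss and return how many lines to
 * prefetch; 0 means the access pattern is not yet trusted. */
static int update_prefetch_entry(prefetch_entry *e, uint32_t miss_addr,
                                 uint32_t prefetch_addrs[MAX_DEGREE])
{
    int32_t new_stride = (int32_t)(miss_addr - e->last_addr);
    int32_t new_delta  = new_stride - e->stride;

    bool fixed_stride  = (new_stride == e->stride);       /* constant step            */
    bool linear_stride = (new_delta  == e->stride_delta); /* step changing linearly   */

    if (fixed_stride || linear_stride) {
        if (e->confidence < 3) e->confidence++;            /* saturate at 3 */
    } else if (e->confidence > 0) {
        e->confidence--;
    }

    e->stride_delta = new_delta;
    e->stride       = new_stride;
    e->last_addr    = miss_addr;

    /* Prefetch degree grows with confidence: 0, 1, 2, 4 lines. */
    int degree = (e->confidence == 0) ? 0 : (1 << (e->confidence - 1));
    if (degree > MAX_DEGREE) degree = MAX_DEGREE;

    uint32_t addr = miss_addr;
    int32_t  step = e->stride;
    for (int i = 0; i < degree; i++) {
        step += e->stride_delta;      /* zero for a fixed stride */
        addr += (uint32_t)step;
        prefetch_addrs[i] = addr;
    }
    return degree;
}

int main(void)
{
    prefetch_entry e = {0};
    uint32_t out[MAX_DEGREE];

    /* Feed a fixed-stride miss stream: 0x100, 0x140, 0x180, ... */
    for (uint32_t a = 0x100; a <= 0x200; a += 0x40) {
        int n = update_prefetch_entry(&e, a, out);
        printf("miss 0x%x -> prefetch %d line(s)\n", (unsigned)a, n);
    }
    return 0;
}

In the real design, the prefetch addresses produced by such an update would be issued through the pipelined prefetch request path described above rather than returned to software.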
Keywords/Search Tags:L2, prefetching, history information, confidence mechanism, self-adaptive step size