Predictive Algorithm For L2 Cache Misses On Chip Multi-Processors

Posted on: 2012-08-02
Degree: Master
Type: Thesis
Country: China
Candidate: F Xiao
GTID: 2218330362956532
Subject: Computer application technology
Abstract/Summary:
A typical CMP (Chip Multi-Processor) architecture often has a shared L2 cache and shared lower levels of the memory hierarchy. Sharing the L2 cache allows high cache utilization and avoids duplicating cache hardware resources. A shared L2 cache can reduce the number of cache misses when data are shared by several threads, but it can also degrade performance through resource contention. It is therefore essential to analyze the behavior of the shared L2 cache in a CMP. The conclusions can be used to optimize programs and rearrange their data accesses, which can greatly reduce cache misses.

To investigate how a thread's performance varies when it runs together with other threads on different cores, we use an analytical model as a reference to predict the number of misses in the shared L2 cache. The model assumes that the parallel threads work on homogeneous tasks and share a fully associative L2 cache. It takes the circular sequence profile as input and estimates the number of misses.

Because the original model assumes that all threads execute homogeneous tasks, it does not accurately predict the number of misses when two threads compute heterogeneous tasks. We therefore extend and improve the original algorithm, using the effective cache size and access frequency to predict the number of misses in the shared L2 cache. The improved model can predict the number of misses both when two threads compute homogeneous tasks and when they compute heterogeneous tasks, and it is more accurate than the original model.

In addition, this thesis analyzes the SESC simulator and studies how it simulates and implements its cache mechanism. We then carry out experiments using the SESC simulator and benchmark programs, and validate the proposed algorithms on several typical programs. Experimental results show that, compared with the original algorithm, our algorithms improve the accuracy of cache miss prediction.
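
The abstract does not give the model's formulas, so the following is only an illustrative sketch of the general idea, not the thesis's algorithm. It uses a plain LRU stack distance profile as a simplified stand-in for the circular sequence profile, and it assumes (hypothetically) that each thread's effective cache size is its share of the cache in proportion to its access frequency. All function names are made up for illustration.

```python
from collections import Counter


def stack_distance_profile(trace):
    """Count accesses by LRU stack distance for one thread's address trace.

    Distance 1 means the line was the most recently used one.
    First-time (cold) accesses are counted under the key None.
    """
    stack = []  # least recently used first, most recently used last
    profile = Counter()
    for addr in trace:
        if addr in stack:
            distance = len(stack) - stack.index(addr)
            profile[distance] += 1
            stack.remove(addr)
        else:
            profile[None] += 1
        stack.append(addr)
    return profile


def predicted_misses(profile, cache_lines):
    """Misses in a fully associative LRU cache holding `cache_lines` lines.

    An access misses if it is cold or if its stack distance exceeds
    the number of lines available to the thread.
    """
    return sum(
        count
        for distance, count in profile.items()
        if distance is None or distance > cache_lines
    )


def effective_sizes(total_lines, access_counts):
    """ASSUMPTION: split the shared cache in proportion to access frequency."""
    total = sum(access_counts)
    return [total_lines * count / total for count in access_counts]


if __name__ == "__main__":
    trace = ["A", "B", "C", "A", "B", "D", "A"]
    profile = stack_distance_profile(trace)
    # Thread running alone with 3 lines versus squeezed down to 2 lines.
    print(predicted_misses(profile, 3), predicted_misses(profile, 2))
    # Hypothetical split of an 8-line cache between two threads.
    print(effective_sizes(8, [300, 100]))
```

Under this sketch, a thread co-scheduled with a more frequently accessing thread is given a smaller effective cache, so more of its reuse distances fall outside the cache and its predicted miss count rises. That is the contention effect the abstract describes.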
Keywords/Search Tags: CMP, L2 cache, stack processing, circular sequences