Multi-bank Data Promotion With Bank Coherence Technique For NUCA On CMP

Posted on: 2013-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
GTID: 2248330371984014
Subject: Computer system architecture
Abstract/Summary:
Computers have become an indispensable tool in daily life and work, and users keep demanding more of them: higher processing speed, greater storage capacity, and more convenient, friendlier operation. To make processors faster, manufacturers kept raising clock frequencies, but the resulting growth in power consumption became the bottleneck for processor speed. The chip multiprocessor (CMP) emerged in response: it integrates multiple computing cores on one processor chip to increase computing power. CMPs have become the mainstream of the market, which makes research on CMP chips necessary.

At the same time, integrated circuit manufacturing has developed rapidly. On-chip caches keep growing in capacity, and the wire delay of a large on-chip cache grows with it. Because longer wire delay has a great impact on CPU processing speed, the non-uniform cache architecture (NUCA) was proposed. NUCA allows cache banks to have different access latencies and so achieves a smaller average access latency than the earlier uniform cache architecture (UCA).

To reduce processor access latency further, dynamic NUCA supports the migration of data blocks: a block that a processor hits is moved to a bank closer to that processor, which reduces the latency the next time the same CPU accesses it. This movement of data within the cache is called data promotion or block migration. Because of the limits of the replacement policy, however, promotion may push more useful data out of the target bank into a farther bank, which in turn increases the total latency of the system. Any improvement to data promotion on a CMP must also take shared data into account.
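
The promotion-on-hit behaviour described above, including its side effect of demoting the block that already occupies the target bank, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the single column of banks, the one-block-per-bank simplification, and the names `promote_fixed_step` and `access` are all assumptions made for illustration. Bank index 0 stands for the bank closest to the requesting CPU.

```python
from typing import List, Optional


def promote_fixed_step(banks: List[Optional[str]], hit_index: int, step: int = 1) -> int:
    """Swap the hit block with the block `step` banks closer to the CPU.

    `banks[0]` is the closest bank (assumed layout). Returns the block's new index.
    """
    target = max(0, hit_index - step)
    if target != hit_index:
        # The block previously in the target bank is demoted to the farther bank.
        banks[target], banks[hit_index] = banks[hit_index], banks[target]
    return target


def access(banks: List[Optional[str]], block: str) -> int:
    """Return the bank index where `block` was found, then promote it one step."""
    found_at = banks.index(block)
    promote_fixed_step(banks, found_at)
    return found_at


# Hypothetical column of four banks holding blocks A..D.
column = ["A", "B", "C", "D"]
found_at = access(column, "C")  # hit in bank 2; C moves to bank 1, B is demoted to bank 2
```

If block B was the more useful one, its next access now pays the latency of a farther bank, which is the drawback of fixed-step promotion that the abstract points out.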
When multiple cores on a chip share one L2 or L3 cache, several CPUs may access the same shared data simultaneously. Data promotion aims to move a block toward the processor that hit it, so that the same CPU sees lower latency on its next access. When several CPUs compete for the same block, however, the block may end up staying in the middle of the NUCA; access latency is then not significantly reduced, which limits the benefit of data promotion.

The improved technique therefore combines data promotion with the bank coherence technique, which allows shared data to have several copies in the NUCA, each copy belonging to a different CPU. The bank coherence technique keeps the copies in different banks coherent, which resolves the problems caused by data competition and speeds up CPU access to shared data. Maintaining coherence requires recording the status of the data. In this thesis, the technique dynamically selects the target bank to which data will move based on the status of the cache line: whether it is empty, and whether the data it holds is valid or dirty. In short, this thesis proposes a data promotion technique for non-uniform cache architecture on CMP that uses the existing bank coherence technique to solve the problem of shared-data competition hindering latency reduction.

The thesis first gives a brief introduction to the research background, related technologies, and several simulation tools used in system architecture research, with a detailed introduction to the Simics simulation tool. It then introduces the existing fixed-step data promotion technique and its problems and explains the bank coherence technique used in this work. Next, it describes in detail the proposed data promotion technique for NUCA on CMP, combined with the data coherence technique. Finally, the technique is tested using full-system simulation and the NAS Parallel Benchmarks (NPB), with the desired results: it effectively reduces processor access latency to the shared cache. Compared with the design proposed by Kim, C., et al., IPC improves by 8.19% on average, the number of promotion operations decreases, and system performance improves.
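
The status-based choice of a target bank mentioned in the abstract could look roughly like the sketch below. The abstract only says that the selection depends on whether a line is empty, valid, or dirty; the preference order used here (empty, then invalid, then clean, then dirty), the tie-break toward the closest bank, and the names `LineStatus` and `select_target_bank` are assumptions for illustration, not the thesis's actual policy.

```python
from enum import Enum
from typing import Dict, Optional


class LineStatus(Enum):
    # Assumed ordering: lower value = cheaper to displace.
    EMPTY = 0
    INVALID = 1
    CLEAN = 2
    DIRTY = 3


def select_target_bank(status_by_bank: Dict[int, LineStatus], current_bank: int) -> Optional[int]:
    """Choose a bank closer to the CPU than `current_bank`.

    Bank numbers stand for distance from the requesting CPU (0 = closest).
    Returns None when the block is already in the closest bank.
    """
    closer = [bank for bank in status_by_bank if bank < current_bank]
    if not closer:
        return None
    # Prefer the cheapest line status; break ties in favour of the closest bank.
    return min(closer, key=lambda bank: (status_by_bank[bank].value, bank))


# Hypothetical statuses of the candidate line in each bank.
statuses = {
    0: LineStatus.DIRTY,
    1: LineStatus.CLEAN,
    2: LineStatus.EMPTY,
    3: LineStatus.CLEAN,
}
target = select_target_bank(statuses, current_bank=3)
```

Here the block in bank 3 would move to bank 2, whose line is empty, rather than displacing the dirty line in bank 0. A selection of this kind avoids pushing useful data into farther banks, at the cost of not always reaching the closest bank in one move.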
Keywords/Search Tags: Multi-core, NUCA, data promotion, bank coherence, Simics