
The Research On Mechanisms Of Optimizing Memory Access In Multi/Many-Core Architecture

Posted on: 2017-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: S Ni
Full Text: PDF
GTID: 2348330503989863
Subject: Computer system architecture
Abstract/Summary:
To meet the data demands of a sharply increased number of processor cores, modern multi/many-core processors are designed with high-bandwidth data channels. How efficiently that bandwidth is used is crucial for program performance, and it is largely determined by the cost of keeping program data coherent across caches. Precisely analyzing the coherence cost is therefore essential for designing efficient parallel programs. Unfortunately, the coherence cost is jointly affected by many factors, such as the degree of parallelism, data sharing, and read/write ratios. This puts coherence cost analysis beyond the capability of most programmers and calls for a model that captures these complexities.

An evaluation model is proposed for analyzing data access performance on multi/many-core processors. The model is built upon an analysis of the cache coherence protocol. It provides dynamic cache state transition analysis and data sharing cost evaluation. The information derived from the model can then be used as guidance during program optimization.

The effectiveness of the model is demonstrated by optimizing a real application. The Bloom filter is selected for this illustration because it is a widely used tool and a representative application with a high degree of parallelism and data sharing. Based on the model's evaluation, we optimize the data access and data sharing patterns of the classic Bloom filter design and obtain a more data-access-efficient Parallel Bloom Filter (PBF) for multi/many-core processors.

Results show that the evaluation model predicts memory access performance with only 7% deviation. PBF improves memory access performance by 3x compared with the classic counting Bloom filter. It also improves scalability: the speedup ratio reaches a maximum of 80.7x with 177 threads.
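
To make the idea of coherence cost concrete, the following is a minimal illustrative sketch, not the thesis's evaluation model. It assumes a simplified MSI protocol (Modified, Shared, Invalid) and replays accesses to a single cache line, counting the events that generate coherence traffic. The function name `simulate`, the event names, and the access encoding are all hypothetical choices made for this illustration.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    S = "Shared"
    I = "Invalid"


def simulate(accesses, num_cores):
    """Replay (core, op) accesses to ONE cache line under a simplified MSI
    protocol (an assumption of this sketch) and count coherence events.

    op is "r" for a read or "w" for a write.
    """
    states = [State.I] * num_cores
    stats = {"hits": 0, "misses": 0, "upgrades": 0,
             "invalidations": 0, "writebacks": 0}

    for core, op in accesses:
        if op == "r":
            if states[core] is not State.I:
                stats["hits"] += 1
                continue
            stats["misses"] += 1
            # A modified copy elsewhere is written back and downgraded.
            for other in range(num_cores):
                if other != core and states[other] is State.M:
                    states[other] = State.S
                    stats["writebacks"] += 1
            states[core] = State.S
        elif op == "w":
            if states[core] is State.M:
                stats["hits"] += 1
                continue
            if states[core] is State.S:
                stats["upgrades"] += 1  # data present, ownership missing
            else:
                stats["misses"] += 1
            # Every other valid copy must be invalidated before the write.
            for other in range(num_cores):
                if other == core or states[other] is State.I:
                    continue
                if states[other] is State.M:
                    stats["writebacks"] += 1
                states[other] = State.I
                stats["invalidations"] += 1
            states[core] = State.M
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return stats


if __name__ == "__main__":
    # Two cores alternately writing the same line ("ping-pong" sharing).
    print(simulate([(0, "w"), (1, "w"), (0, "w"), (1, "w")], num_cores=2))
    # Two cores only reading the same line.
    print(simulate([(0, "r"), (1, "r"), (0, "r"), (1, "r")], num_cores=2))
```

In this toy setting, alternating writes from two cores miss on every access and trigger repeated invalidations and write-backs, while read-only sharing pays only the initial misses. This is the kind of dependence on sharing pattern and read/write ratio that the abstract describes.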
Keywords/Search Tags: Multi-Core Processor, Many-Core Processor, Cache Coherence Protocol, Bloom Filter, Method for Estimating Memory Performance