Modeling Communication Overhead Of ARM Multi-Core Processors On Android Applications

Posted on: 2017-09-12; Degree: Master; Type: Thesis
Country: China; Candidate: G X He; Full Text: PDF
GTID: 2348330491463964; Subject: Integrated circuit engineering
Abstract/Summary:
In recent years, multi-core processors have been widely used in mobile intelligent terminals. Running cores concurrently can reduce program execution time. However, multiple cores also introduce inter-core communication overhead, which limits further improvement of system performance. Studies have shown that cache coherence misses are a key factor in this communication overhead. The number of coherence misses can be obtained from full-system simulation, but the whole process is extremely time-consuming. This thesis aims to develop a model that can quickly predict the coherence misses of private L1 caches on out-of-order multi-core processors with acceptable accuracy.

This thesis combines the memory-access stack distance histogram with write-invalidation information to forecast the number of cache coherence misses. This is an effective method for modeling coherence misses on the private LRU caches of in-order processors. According to the experiments in this thesis, however, the method cannot be applied directly to out-of-order processors. The reason is that features of out-of-order processors, such as out-of-order execution, load handling in the store queue, and non-blocking cache behavior, change the stack distance distribution relative to the one collected in program order. Therefore, this thesis proposes an ANN (artificial neural network) model, named Uniform, to account for these effects. The input of the Uniform model is the stack distance histogram with write-invalidation information, and its output is the number of cache coherence misses. As long as the hardware architecture is unchanged, the ANN model can predict the number of coherence misses across benchmarks.

To assess the accuracy of the model, this thesis uses the Mobybench2.0 and Parsec3.0 benchmark suites. The number of cache coherence misses obtained from full-system simulation has an error of less than 1%. Compared with the results of full-system simulation, the maximum error of the Uniform model is less than 9%. Compared with full-system simulation, the ANN model reduces the time required by more than 56.8% on average. Moreover, when the trained ANN model is used to forecast three other benchmarks, it reduces the time by about 82% relative to full-system simulation.
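The in-order baseline described above (a stack distance histogram combined with write-invalidation information) can be illustrated with a minimal sketch. It is not the thesis's implementation: the trace format `(core, op, line)`, the function names `coherence_histograms` and `estimate_coherence_misses`, and the assumption of a fully associative private LRU cache are all hypothetical choices made for illustration. An access is counted as a likely coherence miss when its stack distance would fit in the cache but a remote write invalidated the line since the core last touched it.

```python
from collections import Counter

COLD = float("inf")  # stack distance assigned to first-time accesses


def coherence_histograms(trace, num_cores):
    """Build per-trace stack distance histograms (hypothetical helper).

    trace: iterable of (core, op, line), op is 'R' or 'W', line is a
    cache-line address. Returns (plain, invalidated), two Counters keyed
    by LRU stack distance.
    """
    stacks = [[] for _ in range(num_cores)]      # per-core LRU stack, MRU last
    invalid = [set() for _ in range(num_cores)]  # lines hit by remote writes
    plain, invalidated = Counter(), Counter()

    for core, op, line in trace:
        stack = stacks[core]
        if line in stack:
            # Distinct lines touched by this core since its last access to `line`.
            dist = len(stack) - 1 - stack.index(line)
            stack.remove(line)
            if line in invalid[core]:
                invalidated[dist] += 1
                invalid[core].discard(line)
            else:
                plain[dist] += 1
        else:
            plain[COLD] += 1
        stack.append(line)

        if op == "W":
            # Write-invalidate: mark the line stale in every other core's stack.
            for other in range(num_cores):
                if other != core and line in stacks[other]:
                    invalid[other].add(line)
    return plain, invalidated


def estimate_coherence_misses(invalidated, cache_lines):
    """Count invalidated accesses that would otherwise have hit in an
    LRU cache holding `cache_lines` lines (fully associative assumption)."""
    return sum(n for dist, n in invalidated.items() if dist < cache_lines)


trace = [(0, "R", 10), (1, "W", 10), (0, "R", 10), (0, "R", 20), (0, "R", 10)]
plain, invalidated = coherence_histograms(trace, num_cores=2)
misses = estimate_coherence_misses(invalidated, cache_lines=4)
```

In this sketch the two histograms together play the role of the feature vector that the abstract describes as the input to the Uniform model; the ANN itself is not shown, since its structure is not given here.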
Keywords/Search Tags: inter-core communication overhead, coherence misses, Out-Of-Order processors, Non-blocking Cache, ANNs