Research and Implementation of Memory Access Optimization in NUMA Environments

Posted on: 2019-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: K M Ruan
Full Text: PDF
GTID: 2428330596463175
Subject: Software engineering
Abstract/Summary:
With the advent of cloud computing and in-memory computing, applications place ever greater demands on the performance of multi-core servers and challenge their scalability. As a result, the NUMA architecture, with its good scalability and low-latency local memory access, has replaced the SMP architecture as the common server architecture. Multi-threaded programs can make full use of a server's CPU resources and improve system resource utilization, but when they run on NUMA architectures they often suffer degraded performance because of memory allocation and access issues. To improve the performance of multi-threaded applications on NUMA architectures, this thesis proposes the design of a memory analyzer for NUMA architectures, built on Pin, Intel's dynamic binary analysis framework. The analyzer examines the memory sharing behavior between threads and objects and how threads access objects at runtime. First, the thesis develops a Pintool that analyzes a multi-threaded application under the NUMA architecture, helping programmers understand the application's memory access behavior and locate its performance bottlenecks, which are mainly remote memory accesses between threads and objects. Second, the thesis simulates the MOESI cache coherence protocol, optimizes the layout of threads and objects in the application under the guidance of the identified bottlenecks, and analyzes the application's cache efficiency by simulating a three-level cache under the NUMA architecture. Optimizing the application's cache usage can also effectively improve performance. Finally, the memory analysis tool is tested on specific multi-threaded programs, and its output is used as guidance to dynamically re-lay out threads and dynamically allocated objects in order to optimize the program. Experimental evaluations show that the memory access analysis can effectively improve application performance and increase cache utilization.
Keywords/Search Tags:NUMA, Memory Analysis, Dynamic Binary Instrumentation, Layout Optimization, Cache Coherence
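
The abstract describes classifying thread-to-object accesses as local or remote and using that information to guide layout. The following is a minimal Python sketch of that idea only, not the thesis's Pintool or its actual algorithm. The trace format, the names summarize_accesses and suggest_placement, and the "place each object on the node that accesses it most" heuristic are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def summarize_accesses(trace, thread_node, object_node):
    """Count local and remote accesses per object.

    trace: iterable of (thread_id, object_id) pairs, one per memory access
           (hypothetical format; a real tool would record addresses).
    thread_node: dict thread_id -> NUMA node the thread runs on.
    object_node: dict object_id -> NUMA node holding the object's memory.
    """
    stats = defaultdict(lambda: {"local": 0, "remote": 0})
    per_node = defaultdict(Counter)  # object_id -> accesses per issuing node
    for tid, oid in trace:
        node = thread_node[tid]
        kind = "local" if node == object_node[oid] else "remote"
        stats[oid][kind] += 1
        per_node[oid][node] += 1
    return dict(stats), dict(per_node)

def suggest_placement(per_node):
    """Assumed heuristic: put each object on the node that issues the most
    accesses to it (ties broken by the lowest node id)."""
    return {oid: min(c, key=lambda n: (-c[n], n)) for oid, c in per_node.items()}

# Hypothetical example: three threads on two nodes, two objects.
thread_node = {0: 0, 1: 1, 2: 1}
object_node = {"buf": 0, "tbl": 1}
trace = [(0, "buf"), (1, "buf"), (2, "buf"), (1, "tbl"), (2, "tbl")]

stats, per_node = summarize_accesses(trace, thread_node, object_node)
placement = suggest_placement(per_node)
print(stats)      # per-object local/remote counts
print(placement)  # suggested node for each object
```

In this toy trace, "buf" lives on node 0 but is accessed mostly from node 1, so the sketch flags it as dominated by remote accesses and suggests moving it. A real analyzer would also have to account for cache effects, which the thesis handles through its MOESI and three-level cache simulation.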