Resource management issues for shared-memory multiprocessors

Posted on:1999-10-06

Degree:Ph.D

Type:Thesis

University:Stanford University

Candidate:Verghese, Thukalan (Ben)

GTID:2468390014472418

Subject:Computer Science

Abstract/Summary:

Shared-memory multiprocessors are attractive as general-purpose compute servers because the tight coupling of multiple processors, memory, and I/O provides enormous computing power in a single system. This thesis addresses two important performance-related issues encountered when running dynamic, compute-server workloads on these systems.

The first issue is "Performance Isolation". Current shared-memory multiprocessor operating systems provide very few controls for sharing the resources of the system among the active tasks or users. Therefore, the load placed by one user or task can adversely affect the performance of another. We propose "performance isolation", a resource management scheme for multi-user multiprocessor systems that balances the goals of isolation of tasks and sharing of resources. Performance isolation ensures desirable behavior in a heavily-loaded system by guaranteeing a task or user its share of the machine regardless of the load placed on the system by other users. Performance isolation also provides good performance in a lightly-loaded system by carefully re-allocating any extra resources that may be idle in the system to tasks that need them. We implement performance isolation in the operating system and demonstrate that it meets its goals.

The second issue is "Data Locality". For performance and scalability reasons, shared-memory machines are evolving from a bus-based architecture (SMP) to a distributed-memory architecture (CC-NUMA). The latter has multiple nodes, each with one or more processors and a portion of the global shared memory. In CC-NUMA systems, the latency to access memory in another node (remote memory) is significantly greater than the latency to access memory in the local node. Therefore, good data locality is required for good application performance. Static allocation of processors and local memory to processes is not effective for the dynamic workloads found on compute servers. We study the problem of providing efficient migration and replication of pages in the kernel in response to changes in the memory-access patterns of dynamic workloads. Cache-miss counting is used to detect these patterns, and trigger the decision to migrate and replicate pages in the virtual memory system. We show that our OS-based solution provides good data locality and greatly improves application performance.
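
The idea of using cache-miss counts to trigger page migration or replication can be illustrated with a small sketch. This is not the kernel policy from the thesis: the abstract gives no thresholds or decision rules, so the constants (`TRIGGER_THRESHOLD`, `SHARING_THRESHOLD`, `WRITE_THRESHOLD`), the `PageStats` record, and the `decide` rule below are all assumptions chosen only to show the general shape of such a policy. The sketch assumes that a page missed on heavily by a single remote node is a candidate for migration, that a read-mostly page missed on by several nodes is a candidate for replication, and that a written, shared page is left alone.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical thresholds; the abstract does not give any values.
TRIGGER_THRESHOLD = 128  # misses from one node before a decision is considered
SHARING_THRESHOLD = 32   # misses from another node that count as sharing
WRITE_THRESHOLD = 1      # writes at or above this rule out replication


@dataclass
class PageStats:
    """Per-page counters (an assumed bookkeeping structure)."""
    home_node: int
    misses: dict = field(default_factory=lambda: defaultdict(int))
    writes: int = 0


def record_miss(page, node, is_write=False):
    """Count one cache miss to `page` issued by a processor on `node`."""
    page.misses[node] += 1
    if is_write:
        page.writes += 1


def decide(page, node):
    """Return 'migrate', 'replicate', or 'none' after a miss from `node`."""
    # Local misses, or too few remote misses, never trigger an action.
    if node == page.home_node or page.misses[node] < TRIGGER_THRESHOLD:
        return "none"
    # Does any other node also miss on this page often enough to count as sharing?
    shared = any(
        count >= SHARING_THRESHOLD
        for other, count in page.misses.items()
        if other != node
    )
    if not shared:
        return "migrate"    # effectively private to `node`: move the page there
    if page.writes < WRITE_THRESHOLD:
        return "replicate"  # read-shared: give `node` its own copy
    return "none"           # write-shared: copies would need to be kept coherent
```

A real implementation would also have to reset or age the counters periodically so that decisions track recent access patterns, and weigh the cost of moving or copying a page against the expected saving in remote-memory latency.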

Keywords/Search Tags:

Memory, Performance, Processors, System, Data locality, Provides

Related items

1	Scalable Compiler Optimizations for Improving the Memory System Performance in Multi- and Many-core Processors
2	Memory and control organizations of stream processors
3	Research On Endurance For Phase Change Memory Based On Data Value Locality
4	Locality Transformations and Prediction Techniques for Optimizing Multicore Memory Performance
5	Increasing memory performance in multi-core processors
6	Intelligent memory manager: Towards improving the locality behavior of allocation-intensive applications
7	Exploring, defining, and exploiting recent store value locality
8	Software and hardware methods for memory access latency reduction on ILP processors
9	Speculative distributed shared-memory multiprocessors organized as processor-and-memory hierarchies
10	Model Driven Cache Management