Resource management issues for shared-memory multiprocessors

Posted on: 1999-10-06
Degree: Ph.D
Type: Thesis
University: Stanford University
Candidate: Verghese, Thukalan (Ben) Verghese
GTID: 2468390014472418
Subject: Computer Science
Abstract/Summary:
Shared-memory multiprocessors are attractive as general-purpose compute servers because the tight coupling of multiple processors, memory, and I/O provides enormous computing power in a single system. This thesis addresses two important performance-related issues encountered when running dynamic, compute-server workloads on these systems.

The first issue is "Performance Isolation". Current shared-memory multiprocessor operating systems provide very few controls for sharing the resources of the system among the active tasks or users. Therefore, the load placed by one user or task can adversely affect the performance of another. We propose "performance isolation", a resource management scheme for multi-user multiprocessor systems that balances the goals of isolation of tasks and sharing of resources. Performance isolation ensures desirable behavior in a heavily-loaded system by guaranteeing a task or user its share of the machine regardless of the load placed on the system by other users. Performance isolation also provides good performance in a lightly-loaded system by carefully re-allocating any extra resources that may be idle in the system to tasks that need them. We implement performance isolation in the operating system and demonstrate that it meets its goals.

The second issue is "Data Locality". For performance and scalability reasons, shared-memory machines are evolving from a bus-based architecture (SMP) to a distributed-memory architecture (CC-NUMA). The latter has multiple nodes, each with one or more processors and a portion of the global shared memory. In CC-NUMA systems, the latency to access memory in another node (remote memory) is significantly greater than the latency to access memory in the local node. Therefore, good data locality is required for good application performance. Static allocation of processors and local memory to processes is not effective for the dynamic workloads found on compute servers. We study the problem of providing efficient migration and replication of pages in the kernel in response to changes in the memory-access patterns of dynamic workloads. Cache-miss counting is used to detect these patterns, and trigger the decision to migrate and replicate pages in the virtual memory system. We show that our OS-based solution provides good data locality and greatly improves application performance.
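
The two goals of performance isolation described in the abstract (a guaranteed share under heavy load, and lending of idle resources under light load) can be illustrated with a small sketch. This is not the thesis's implementation, which lives in the operating system and is not detailed here. The function name `allocate_cpus`, the use of whole CPUs as the resource, and the alphabetical lending order are all assumptions made for illustration.

```python
def allocate_cpus(entitled, demand):
    """Return the number of CPUs granted to each user.

    entitled: dict user -> CPUs the user is guaranteed (hypothetical shares)
    demand:   dict user -> CPUs the user could use right now
    """
    # Isolation: each user receives up to its own share, no matter how
    # much load the other users place on the machine.
    granted = {u: min(entitled[u], demand.get(u, 0)) for u in entitled}

    # Sharing: CPUs left idle by lightly loaded users are lent to users
    # whose demand exceeds their share. Lending in sorted order is an
    # assumption of this sketch, not a policy taken from the thesis.
    idle = sum(entitled.values()) - sum(granted.values())
    for u in sorted(entitled):
        if idle == 0:
            break
        extra = min(idle, demand.get(u, 0) - granted[u])
        granted[u] += extra
        idle -= extra
    return granted
```

With shares of 4 CPUs each, `allocate_cpus({"a": 4, "b": 4}, {"a": 8, "b": 8})` holds both users to their shares, while `allocate_cpus({"a": 4, "b": 4}, {"a": 8, "b": 1})` lends the three CPUs that user `b` leaves idle to user `a`.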
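
The abstract states that cache-miss counting detects access patterns and triggers the decision to migrate or replicate a page. A minimal sketch of one such decision rule follows. The thresholds `TRIGGER` and `SHARING`, the `Page` record, and the rule itself (migrate a page used mainly by one remote node, replicate a page read by several nodes, leave a written and shared page in place) are assumptions for illustration. The abstract does not specify the actual policy; the read-only condition on replication reflects the general fact that replicas of a written page would have to be kept consistent.

```python
from collections import Counter
from dataclasses import dataclass, field

TRIGGER = 128  # hypothetical: remote misses from one node that force a decision
SHARING = 32   # hypothetical: misses from another node that mark the page shared


@dataclass
class Page:
    home: int                                       # node holding the page
    misses: Counter = field(default_factory=Counter)  # per-node miss counts
    writes: int = 0                                 # write misses since last reset


def record_miss(page, node, is_write=False):
    """Count one cache miss; return 'migrate', 'replicate', or None."""
    page.misses[node] += 1
    if is_write:
        page.writes += 1
    # Local misses are counted (for sharing detection) but never trigger.
    if node == page.home or page.misses[node] < TRIGGER:
        return None

    # A remote node has crossed the trigger threshold: decide what to do.
    shared = any(n != node and c >= SHARING for n, c in page.misses.items())
    if not shared:
        action = "migrate"      # one dominant user: move the page to it
    elif page.writes == 0:
        action = "replicate"    # read-shared: give each reader a local copy
    else:
        action = None           # written and shared: leave the page in place

    # Start a fresh counting interval so that old behaviour ages out.
    page.misses.clear()
    page.writes = 0
    return action
```

The caller is expected to carry out the returned action, for example by updating `page.home` after a migration. Resetting the counters after each decision lets the rule respond to changes in the access pattern, which is the property the abstract asks for in dynamic workloads.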
Keywords/Search Tags: Memory, Performance, Processors, System, Data locality, Provides