
A comparison between virtual code management techniques

Posted on: 2012-02-19
Degree: Ph.D
Type: Thesis
University: University of Delaware
Candidate: Manzano, Joseph B
Full Text: PDF
GTID: 2468390011963108
Subject: Engineering
Abstract/Summary:
During the past decade (2000 to 2010), multi-/many-core architectures have seen a renaissance, driven by an insatiable appetite for performance. Limits on applications and hardware technologies put a stop to the frequency race in 2005. Current designs can be divided into homogeneous and heterogeneous ones. Homogeneous designs are the easiest to use, since most toolchain components and system software need little rewriting. At the other end of the spectrum are the heterogeneous designs. These designs offer tremendous raw computational power, but at the cost of losing hardware features that might be necessary or even essential for certain types of system software and programming languages. An example of this architectural design is the Cell B.E. processor, which combines a heavy core with a group of simple cores designed to be its computational engine.
Recently, this architecture has been placed in the public eye thanks to being a central component of one of the fastest supercomputers in the world. Moreover, it is the main processing unit of Sony's PlayStation 3 video game console, the most powerful video game console currently on the market. Even though this architecture is very well known for its accomplishments, it is also well known for its very low programmability. Because of this, most of its system software efforts are dedicated to improving programmability. The most famous include ALF, DaCS, CellSs, the single-source XL compiler, and IBM's Octopiler, among others. Most of these frameworks have been designed to support (directly or indirectly) high-level parallel programming languages. Among them is an effort called Open OPELL from the University of Delaware. This toolchain/framework tries to bring the OpenMP parallel programming model (the de facto shared memory parallel programming paradigm) to the Cell B.E. architecture.
The OPELL framework is composed of four components: a single-source toolchain, a very light SPU kernel, a software cache, and a partition/code overlay manager. This extra layer increases the system's programmability, but it also increases the runtime system's overhead. To reduce the overhead, each of the components can be further optimized. This thesis concentrates on optimizing the partition manager component by reducing the number of long-latency transactions (DMA transfers) that it produces. The contributions of this thesis are as follows: (1) The development of a dynamic framework that loads and manages partitions across function calls. In this manner, the restrictive memory problem can be alleviated and the range of applications that can be run on the co-processing unit is expanded. (2) The implementation of replacement policies that reduce the number of DMA transfers across partitions. Such replacement policies aim to optimize the most costly operations in the proposed framework. They can take the form of buffer divisions, rules about eviction and loading, etc. (3) A quantification of such replacement policies for a selected set of applications, and a report of the overhead of such policies. Although several policies can be proposed, a quantitative study is necessary to determine which policy is best for which application, since the code can behave differently. (4) An API that can be easily ported and extended to several types of architectures. The problem of restricted space is not going away. The new trend seems to favor an increasing number of cores (with local memories) instead of more hardware features and heavy system software. This means that frameworks like the one proposed in this thesis will become more and more important as the wave of multi-/many-core designs continues its ascent.
(5) A productivity study that tries to define the elusive concept of productivity with a set of metrics and the introduction of expertise as weighting functions.
Finally, the whole framework can be adapted to support task-based frameworks by using the partition space as a task buffer and loading the code on demand with minimal user interaction. Tasks of this type are called Dynamic Code Enclaves, or DyCE.
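The effect of a replacement policy on DMA traffic, as described in contributions (2) and (3), can be illustrated with a small simulation. The sketch below is not the thesis's partition manager: the names PartitionBuffer and count_transfers are hypothetical, least-recently-used (LRU) eviction is an assumed policy chosen only for illustration, and the model ignores partition sizes and the reload that may be needed when a callee returns to an evicted caller. It simply counts how many code loads (standing in for DMA transfers) a trace of calls would trigger for a given number of buffer slots.

```python
from collections import OrderedDict


class PartitionBuffer:
    """Hypothetical model of a code buffer in a core's local memory.

    The buffer holds a fixed number of equally sized slots, each able to
    hold one code partition. LRU eviction is an assumption made for this
    sketch, not the policy evaluated in the thesis.
    """

    def __init__(self, slots):
        if slots < 1:
            raise ValueError("the buffer needs at least one slot")
        self.slots = slots
        # Keys are resident partition ids, ordered from least to most
        # recently used.
        self.resident = OrderedDict()
        self.dma_transfers = 0

    def call(self, partition_id):
        """Make a partition resident before a call into it.

        Returns True when the partition had to be loaded (one simulated
        DMA transfer) and False when it was already in the buffer.
        """
        if partition_id in self.resident:
            # Hit: only refresh the recency order, no transfer needed.
            self.resident.move_to_end(partition_id)
            return False
        if len(self.resident) >= self.slots:
            # Buffer full: evict the least recently used partition.
            self.resident.popitem(last=False)
        self.resident[partition_id] = None
        self.dma_transfers += 1
        return True


def count_transfers(trace, slots):
    """Replay a sequence of partition ids and count simulated loads."""
    buffer = PartitionBuffer(slots)
    for partition_id in trace:
        buffer.call(partition_id)
    return buffer.dma_transfers


if __name__ == "__main__":
    trace = ["A", "B", "A", "C", "A", "B"]
    for slots in (1, 2, 3):
        print(slots, "slot(s):", count_transfers(trace, slots), "loads")
```

Replaying the same trace with different slot counts or eviction rules is one simple way to compare policies, which is the kind of quantitative comparison contribution (3) calls for; a realistic study would also have to account for the cost of each transfer and the overhead of the policy itself.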
Keywords/Search Tags: Code, System software