Research On The Key Technologies Of Peer-to-Peer Based High Performance Computing For Applications With Data-dependency

Posted on: 2009-10-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Luo
GTID: 1118360272972365
Subject: Computer system architecture
Abstract/Summary:
For compute-intensive applications with task-level parallelism, peer-to-peer (P2P) based high performance computing (HPC) systems can potentially assemble idle Internet cycles into a virtual supercomputer more powerful than any current computing center, and a growing number of such systems are appearing. Most existing systems are limited, however, because they cannot support applications whose tasks depend on one another's data. To promote P2P based HPC systems, it is therefore necessary to investigate the key technologies that enable them to support more general applications with data-dependency.

To this end, a new P2P based HPC platform, P2HP-2, is designed in this dissertation. Its key technologies are studied, namely the resource management strategy, the task scheduling policy, and the application programming model. The overall performance of P2HP-2 is then evaluated using an application instance.

To meet the resource allocation requirements of dependent tasks, the computing resources of P2HP-2 self-organize into GTapestry, a structured overlay built on unstructured workgroups. The construction of GTapestry takes overall proximity matching into account, including the match between the distribution of objects and the construction of domains, and the match in location between the virtual and physical networks. In addition, dynamic maintenance metrics for nodes and groups and an object pointer backup policy are used to improve routing efficiency. The structural design of GTapestry provides self-adaptation and self-organization, and the efficient communication mechanism between workgroups helps P2HP-2 support applications with task-dependent relationships.

To schedule the execution of applications with data-dependency, a negotiated coscheduling policy is proposed.
Based on the dependency relationships described in the project description file and on the local neighbor tables of the nodes, the policy dispatches and coschedules dependent tasks through local negotiation. First, each dependent task is assigned a task priority and is dispatched to one of the neighbors of the current node, chosen by negotiation among the neighbor nodes. The task is then coscheduled under non-preemptive single-task scheduling, once the programs, parameters, and dependent data it needs have been completely configured. Meanwhile, the nodes dynamically adjust the distribution of tasks according to their current load, again through local negotiation.

The programming model addresses the problem at the interface between applications and computing systems, and a one-sided message passing programming model (OMP) is implemented to drive the development of P2HP-2. Based on the communication mode between tasks and the runtime, OMP consists of a communication library (ComLib) and a software development kit (SDK). ComLib provides a one-sided communication mechanism with the help of the runtime. The SDK is built on ComLib, and its application programming interfaces (APIs) can be used to parallelize applications through module division. Tasks can thus obtain their dependent data by issuing a data request through the OMP APIs; the request is resolved by the combination of the communication mechanism in GTapestry and the negotiated coscheduling policy of P2HP-2.

To verify P2HP-2's capability to support interacting tasks, a parallel benchmark is presented: a 1-level-core parallel threading algorithm is proposed and implemented as a parallel threading program with OMP.
In the parallel 1-level-core threading program, the threading process between one protein template and one target sequence is divided into tasks, which execute in parallel and have a static, tree-like dependency relationship among them.

Compared with Tapestry, a structured P2P network, GTapestry is more stable, routes more efficiently, and costs less to maintain dynamically. The results show that the structural design of GTapestry can meet the requirement of dependent tasks for highly efficient communication. Moreover, theoretical and experimental analysis indicates that the negotiated coscheduling policy can schedule the execution of applications with data-dependency and is suited to dynamic environments. Finally, performance tests of the parallel threading algorithm on P2HP-2 show that the key technologies, namely the resource management strategy of GTapestry, the negotiated coscheduling policy, and the one-sided message passing programming model, enable P2P based HPC systems to support applications with data-dependency.
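
The abstract does not give the details of the negotiated dispatch, so the following is only a minimal sketch of the general idea under stated assumptions: tasks form a static tree-like dependency structure, each task receives a priority, and each task is handed to one neighbor chosen by comparing loads. The priority rule (dependency depth), the negotiation rule (least-loaded neighbor, ties broken by name), and all names such as `Task`, `priorities`, `negotiate`, and `dispatch` are hypothetical and are not taken from P2HP-2.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    # Names of the tasks whose output data this task needs (assumed acyclic).
    depends_on: list = field(default_factory=list)


def priorities(tasks):
    """Assumed priority rule: depth in the dependency tree.

    A task with no dependencies gets 0; otherwise 1 + the largest
    priority among the tasks it depends on. Lower values go first.
    """
    by_name = {t.name: t for t in tasks}
    memo = {}

    def depth(name):
        if name not in memo:
            deps = by_name[name].depends_on
            memo[name] = 0 if not deps else 1 + max(depth(d) for d in deps)
        return memo[name]

    return {t.name: depth(t.name) for t in tasks}


def negotiate(neighbor_load):
    """Stand-in for local negotiation: pick the least-loaded neighbor.

    Ties are broken alphabetically so the result is deterministic.
    """
    return min(sorted(neighbor_load), key=lambda n: neighbor_load[n])


def dispatch(tasks, neighbors):
    """Assign every task to a neighbor, in priority order."""
    load = {n: 0 for n in neighbors}
    prio = priorities(tasks)
    placement = {}
    for task in sorted(tasks, key=lambda t: (prio[t.name], t.name)):
        node = negotiate(load)
        placement[task.name] = node
        load[node] += 1  # the chosen neighbor now holds one more task
    return placement, prio


if __name__ == "__main__":
    demo = [
        Task("leaf_a"),
        Task("leaf_b"),
        Task("merge", depends_on=["leaf_a", "leaf_b"]),
        Task("root", depends_on=["merge"]),
    ]
    placement, prio = dispatch(demo, ["n1", "n2"])
    print(prio)
    print(placement)
```

In this sketch, dispatching in priority order means a task is placed only after the tasks it depends on have been placed. A real system would additionally wait for the dependent data to arrive before starting a task and would rebalance as node loads change, as the abstract describes.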
Keywords/Search Tags: High performance computing, Peer-to-peer computing, Resource management, Task scheduling, Programming model, Data-dependency