
Design Of Parallel Systems For Stateful Applications On Clusters

Posted on: 2008-07-19 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y Chen | Full Text: PDF
GTID: 1118360215476770 | Subject: Circuits and Systems
Abstract/Summary:
In the past ten years, cluster systems have seen unprecedented development. This wide deployment is due to the cluster's attractive features, such as simplicity, high availability, scalability and a high performance-to-cost ratio. These features make clusters exceptionally well suited to building highly available, high-performance and scalable information systems in computer and telecom networks. One of the most typical examples is stateful systems, which share several characteristics: 1) they maintain huge amounts of data; 2) the data are arranged in units that are independent of each other; 3) a typical operation involves only one unit; 4) they must support a large number of concurrent operations.

Despite these advantages, such systems are generally difficult to design, and designers face many problems. Proposing solutions that meet the service requirements without conflicting with one another is largely an art. The goal of this thesis is to alleviate these difficulties.

For stateful applications, this thesis builds mathematical models to investigate which factors affect system performance, and to what degree, under a Round-Robin scheduling strategy. It finds and proves in theory that: 1) system performance, i.e., the utilization of each application server (AS) and the stretch factor of the average response time, depends mainly on the ratio of the call arrival rate to the service rate and is hardly affected by the numbers of ASes and sources; 2) when the system deals with fast sources, the average throughput of each AS drops significantly while the average response time increases only slightly; in particular, the throughput per node reaches only 70% of its maximum, and the average response time grows to no more than 2 times the service time; 3) when the system deals with slow sources, the average throughput of each AS drops only slightly while the average response time increases greatly; in particular, the throughput per node reaches over 90% of its maximum, and the average response time grows to more than 9 times the service time. (A minimal queueing sketch of this kind of model is given below.)

This thesis also compares three common load-balancing strategies, called Minimal, Weighted and Random. The results are: 1) Weighted coordinates the decisions made by the individual nodes well and performs better than Minimal; 2) Random can perform better when the load information about the backend nodes held by the frontend nodes is badly out of date. (The three selection rules are sketched below.)

For massive storage systems like the Google File System, this thesis investigates the factors that affect system MTTF when the number of nodes reaches the hundreds or thousands. The work finds and proves theoretically that: 1) for systems like the Google File System, system MTTF is only about 1 hour when the replica number is 2, and rises to more than a dozen days, even tens of days, when the replica number is 3; 2) on node failure, the rate at which data blocks are copied between nodes affects system MTTF significantly: the higher this rate, the longer the system MTTF; 3) the randomness of data-block placement is crucial to system MTTF, and systems with poor placement randomness have a short system MTTF. (A back-of-envelope replication model is sketched below.)
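As an illustration of the kind of model behind the round-robin results, the following is a minimal sketch, assuming each AS can be approximated as an M/M/1 queue that receives a 1/m share of the traffic; the symbols λ (aggregate call arrival rate), μ (per-AS service rate) and m (number of ASes), and the M/M/1 assumption itself, are illustrative and not taken from the thesis.

```latex
% Round-robin spreads an aggregate call arrival rate \lambda over m application
% servers (ASes), each approximated as an M/M/1 queue with service rate \mu.
\[
  \rho = \frac{\lambda}{m\,\mu},
  \qquad
  S = \frac{\bar{T}_{\text{response}}}{T_{\text{service}}} = \frac{1}{1-\rho}
\]
% \rho : utilization of each AS;  S : stretch factor of the average response time.
```

Under this approximation both quantities are functions of the arrival-to-service ratio alone, which is consistent in spirit with the thesis's finding that this ratio, rather than the absolute numbers of ASes and sources, is the dominant factor.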
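The three load-balancing strategies compared above can be summarized by the selection rule a frontend uses to pick a backend. The sketch below is a minimal illustration, not the thesis's implementation; the reported-load snapshot, the spare-capacity weighting and the function names are assumptions.

```python
import random

def pick_minimal(load):
    """Minimal: choose the backend whose last reported load is smallest."""
    return min(load, key=load.get)

def pick_random(load):
    """Random: ignore load reports entirely; robust when they are stale."""
    return random.choice(list(load))

def pick_weighted(load, capacity):
    """Weighted: choose a backend with probability proportional to its
    estimated spare capacity (capacity minus last reported load)."""
    spare = {n: max(capacity[n] - load[n], 0.0) for n in load}
    if sum(spare.values()) == 0.0:
        return pick_random(load)            # every node looks full: fall back
    nodes = list(spare)
    return random.choices(nodes, weights=[spare[n] for n in nodes], k=1)[0]

# Example with possibly stale reports: Minimal keeps sending new calls to "b",
# Weighted spreads them out, Random ignores the reports altogether.
reported_load = {"a": 0.7, "b": 0.2, "c": 0.6}
capacity = {"a": 1.0, "b": 1.0, "c": 1.0}
print(pick_minimal(reported_load), pick_weighted(reported_load, capacity),
      pick_random(reported_load))
```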
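To give a feel for why the replica count dominates system MTTF, the sketch below evaluates a textbook birth-death approximation for the mean time to data loss of one replica group and divides it by the number of independent replica groups. The formula, the parameter values and the assumption of independent groups are illustrative only and are not the thesis's model.

```python
from math import factorial

def group_mttdl(node_mttf_h, rereplication_h, replicas):
    """Birth-death approximation for the mean time to data loss (hours)
    of one replica group: MTTDL ~ mu^(R-1) / (R! * lam^R),
    where lam = 1/node_mttf and mu = 1/rereplication_time."""
    lam = 1.0 / node_mttf_h          # per-node failure rate (1/h)
    mu = 1.0 / rereplication_h       # re-replication (repair) rate (1/h)
    return mu ** (replicas - 1) / (factorial(replicas) * lam ** replicas)

def system_mttdl(node_mttf_h, rereplication_h, replicas, groups):
    """With `groups` independent replica groups, loss events add up,
    so the system-level MTTDL is the per-group value divided by groups."""
    return group_mttdl(node_mttf_h, rereplication_h, replicas) / groups

# Purely illustrative parameters (not the thesis's): nodes fail about once a
# month, a lost replica is restored in 0.5 h, and there are 100,000 groups.
for r in (2, 3):
    h = system_mttdl(node_mttf_h=30 * 24, rereplication_h=0.5, replicas=r,
                     groups=100_000)
    print(f"replicas={r}: system MTTDL ~ {h:.1f} h (~{h / 24:.1f} days)")
```

Even with these made-up numbers, moving from 2 to 3 replicas lengthens the estimated time to data loss by several orders of magnitude, which mirrors the steep dependence on replica count reported above.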
This thesis also studies the effects of other factors on system MTTF. The results include: 1) for systems configured like the Google File System, system MTTF can be above 10 days if the node MTTR is two days, and it still exceeds 4 days if the node MTTR is 4 days; 2) if a system MTTF of 4 days is sufficient, the system administrator can batch the handling of failed nodes every 2 or 3 days to improve maintenance efficiency; 3) the restart strategy cannot offer a better system MTTF than the copy strategy when the number of nodes is large.

With respect to the parallel programming model, this thesis investigates the aspects that affect system performance in the stage-based parallel programming model. A number of simulations show that system performance drops significantly, even to an unacceptable level, when the design is poor or parameters are set inappropriately. From these simulations, three design rules are obtained: 1) the number of stages on each processor must be set appropriately, usually below 10; 2) for stages with blocking calls, the number of concurrent threads can be set to the product of the call arrival rate and the average blocking time (a sketch of this sizing rule is given below); 3) tasks that send temporary (provisional) responses should be placed in a stage without blocking calls so that the responses can be sent back in time. These rules are applied to the design of a SIP proxy server, which confirms the findings.

With respect to fault-tolerant design, this thesis presents a new layered architecture to help designers produce clear fault-tolerant designs. Its concepts are the Unit design, the System design, the Fault-tolerant Unit design and the Fault-tolerant System design. Applying these concepts to the analysis of 4 typical systems shows that they cleanly separate the fault-tolerant design from the overall design, which makes analysis easier. The procedure used in the analysis is also valuable to designers working on their own systems.

From many fault-tolerant designs, four common design rules are summarized: node failure detection, failure processing, bookkeeping and caching, and retrying. 1) Node failure detection: to ensure that a message r1 has been successfully received, the sender node must wait for the response to r1. 2) Failure processing: failure notifications should, as far as possible, be handled as ordinary messages of the original design. 3) Bookkeeping and caching: operations and states should be recorded or cached to keep data consistent among replicas. 4) Retrying: failed transactions should be retried until they complete. The analysis of the 4 typical systems shows that these rules are used frequently and with refined techniques. (A sketch combining rules 1 and 4 is given below.)

Finally, to help designers choose appropriate solutions when building stateful applications for clusters, this thesis classifies and summarizes the common problems in such designs, gives common solutions for each problem, and describes the characteristics of each solution. These results are presented in tables for ease of use.
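Design rule 2 for blocking stages is essentially Little's law: the number of requests in flight at a stage equals the arrival rate times the average time each request spends blocked. The snippet below is a small illustration with made-up numbers; the function and parameter names are not from the thesis.

```python
import math

def threads_for_blocking_stage(arrival_rate_per_s, avg_blocking_s):
    """Size a blocking stage's thread pool by Little's law:
    concurrent requests L = arrival rate * average blocking time."""
    return math.ceil(arrival_rate_per_s * avg_blocking_s)

# Example: a stage that makes a blocking database call.
# 200 calls/s, each blocking for about 50 ms, needs roughly 10 threads.
print(threads_for_blocking_stage(arrival_rate_per_s=200, avg_blocking_s=0.05))
```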
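The failure-detection and retrying rules combine into the familiar acknowledged-send-with-retry pattern: the sender only treats a message as delivered once it has seen the response, and retries the transaction if no response arrives. The sketch below is a generic illustration under assumed interfaces (`send`, `wait_for_response`); it is not code from the thesis or from any of the 4 analysed systems.

```python
import time

class SendFailure(Exception):
    """Raised when a message gets no response after all retries."""

def send_with_retry(send, wait_for_response, message, retries=3, timeout_s=2.0):
    """Node failure detection + retrying:
    - detection: the sender waits for the response to the message it sent
      before considering it delivered (rule 1);
    - retrying: a transaction that saw no response is retried so that it
      eventually completes or is reported as failed (rule 4)."""
    for attempt in range(1, retries + 1):
        send(message)
        response = wait_for_response(message, timeout_s)
        if response is not None:
            return response                      # message acknowledged
        time.sleep(min(0.1 * attempt, 1.0))      # brief back-off before retry
    raise SendFailure(f"no response to {message!r} after {retries} attempts")
```

In a real system the SendFailure would then be fed into the failure-processing path (rule 2), and each attempt would be recorded or cached for bookkeeping (rule 3).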
Keywords/Search Tags: stateful application, parallel system, cluster, high availability, high performance, scalability