
The Research And Application Of Linux Cluster System's Load Balance Mechanism For Data Retrieval

Posted on: 2011-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: S Cui
GTID: 2178360305454904
Subject: Computer system architecture
Abstract/Summary:
With the rapid growth of network services, the nodes that provide them face an increasing number of user requests. Data flow and computing intensity rise constantly, which puts great pressure on network bandwidth and servers. Bottlenecks will increasingly appear at the server side, so it is urgent to build network services that are highly available, cost-effective, and scalable enough to meet the growing load. Load balancing technology for Linux-based clusters has emerged in response.
The research topic of this thesis comes from a project of the Science and Innovation Fund for SMEs of the National Ministry of Science and Technology: a Linux cluster system for data retrieval. The project product is a high-performance, highly reliable cluster software product for data retrieval, based on a Linux cluster system. Because it integrates high-availability software, load balancing, and a cluster file system into a whole, it simplifies cluster management, is convenient to apply, and provides strong protection for an enterprise's important business applications. Load balancing is a key technology of a Linux cluster system: it can expand the bandwidth of network devices and servers, increase throughput and network capacity, and provide a reliable guarantee for normal high-availability operation.
First, this thesis introduces the project's background and the relevant technical knowledge, and discusses the characteristics and classification of cluster systems. Second, it introduces load balancing and, after studying the commonly used load balancing strategies, proposes a load balancing strategy based on dynamic real-time feedback information that can predict load capacity, and describes the algorithm.
Finally, the thesis describes how load balancing was implemented on the Linux cluster system, and sets up a simulated environment for debugging and running it.
Cluster technology organizes multiple computers to work together so that they behave like a single, more powerful computer. A cluster system consists of several servers with shared data storage that communicate with one another over the network; when one server fails, its applications are automatically taken over by another server. In most models, all the computers in the cluster share a common name, and any service running in the cluster can be used by all users, so the cluster appears as a single system.
Load balancing can be divided into static and dynamic load balancing according to how tasks are allocated. With static load balancing in a network environment, when the load balancer receives a task request from a user, it distributes tasks as evenly as possible across the servers of the cluster according to a fixed algorithm, so that the number of user requests handled by each server stays roughly balanced. This balance, however, does not take the load capacity of each server into account.
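The limitation of static allocation described above can be illustrated with a minimal Python sketch. Round-robin is used here only as one common example of a fixed allocation algorithm; the abstract does not name the static algorithm it has in mind, and the node names and capacity figures are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin_dispatch(servers, num_requests):
    """Assign requests to servers in a fixed rotation, ignoring their capacity."""
    rotation = cycle(servers)
    return Counter(next(rotation) for _ in range(num_requests))


# Hypothetical nodes: how many requests each can handle per interval.
capacity = {"node-a": 100, "node-b": 50, "node-c": 10}

# Every node receives the same share of the 90 requests.
assigned = round_robin_dispatch(list(capacity), 90)

# Nodes that received more requests than they can handle.
overloaded = [name for name, count in assigned.items() if count > capacity[name]]
```

The request counts are perfectly balanced, yet the weakest node is overloaded, which is the gap the dynamic strategy is meant to close.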
Dynamic load balancing has many advantages over static load balancing. It assigns each task to the server node with the lighter load and keeps a real-time record of each node's load information, so that no single node is overloaded and the load across the cluster's servers is as uniform as possible. Allocation is therefore dynamic and takes into account the actual carrying capacity of each node.
This thesis discusses a load balancing strategy based on dynamic feedback of load information. The algorithm has the following main characteristics.
First, it gives full consideration to each server node's processing power and current load. In a cluster system the performance of the server nodes may differ, so in practice a server performance index must be considered: more tasks are allocated to servers with high processing ability and fewer to low-performance servers with low processing ability. With the performance index in place, the algorithm sets each node's dynamic load value by monitoring the node in real time, so that the value reflects the node's actual load capacity.
Second, the collection and calculation of node information are carried out on the nodes themselves, which keeps the load balancer from being overloaded and becoming the system bottleneck. Previously the central node collected information from all nodes; now each node collects its own status and sends it to the scheduling center. The central scheduling node therefore only needs the load information sent by the nodes to make its scheduling decisions, instead of collecting information from each node machine, which reduces both the extra communication overhead of load information collection and the burden on the scheduling node.
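The division of work between nodes and balancer can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's algorithm: the abstract gives no formula, so the load value used here (a weighted mix of CPU and memory utilization divided by a performance index) and all names (`node_report`, `Balancer`, the 0.6/0.4 weights) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoadReport:
    node: str
    load_value: float  # computed on the node itself; lower means more spare capacity


def node_report(node, performance_index, cpu_util, mem_util):
    """Runs on a server node: measure, calculate, and package its own load value.

    Assumed formula: utilization mix scaled down by the node's performance index,
    so a powerful node reports a lower value than a weak node at the same utilization.
    """
    utilization = 0.6 * cpu_util + 0.4 * mem_util
    return LoadReport(node, utilization / performance_index)


class Balancer:
    """Runs on the scheduling node: stores pushed reports, never polls the nodes."""

    def __init__(self):
        self.latest = {}

    def receive(self, report):
        # Keep only the most recent report from each node.
        self.latest[report.node] = report

    def pick(self):
        # Choose the node whose last reported load value is smallest.
        return min(self.latest.values(), key=lambda r: r.load_value).node


balancer = Balancer()
balancer.receive(node_report("node-a", performance_index=2.0, cpu_util=0.5, mem_util=0.5))
balancer.receive(node_report("node-b", performance_index=1.0, cpu_util=0.3, mem_util=0.3))
first_choice = balancer.pick()

# node-a becomes busy and pushes a new report; the decision changes.
balancer.receive(node_report("node-a", performance_index=2.0, cpu_util=0.9, mem_util=0.9))
second_choice = balancer.pick()
```

Because each node sends a single precomputed number, the balancer's per-decision work is limited to comparing stored values.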
Finally, the algorithm introduces a binary sort tree. A weight is calculated for each server node by combining its performance index with its real-time load information, and the weights are used to build the binary sort tree. An in-order (LDR) traversal of this tree then yields the server nodes ordered by current load from smallest to largest, and the load balancer dispatches tasks according to this load information. The algorithm also introduces a load redundancy value for each server node, which predicts the node's real-time spare load capacity and prevents a single server node from being assigned too many tasks in a short period of time.
A Linux load balancing cluster was simulated in a Linux virtual machine environment. The load balancer's main tasks are to collect information from the sub-server nodes at regular intervals, receive tasks from external requests, and allocate the tasks to the sub-server nodes. A server node's main tasks are to send its system information to the load balancer at regular intervals and to handle the task requests from the balancer.
A simulation test Web server was set up. Requests are sent from a user page, and data is exchanged through XMLHttpRequest and an Apache2 Web server. After the server analyzes the type and quantity of the requests, a CGI program written in Perl invokes the relevant procedure over Socket communication and sends the external requests to the load balancer. The user interface for simulated task requests is written in HTML and JavaScript; on this page the user can set the request type, the number of task requests, the IP address of the load balancer, and the port number. The Linux cluster was built in the simulation system for debugging and running the program. The system has passed testing by the Jilin Branch Center of the China Software Testing Center and obtained a product test report.
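The binary sort tree and load redundancy value mentioned in the abstract can be sketched in Python as follows. The tree and its in-order (LDR) traversal are standard; how the thesis actually computes weights and redundancy is not stated, so here the redundancy is assumed to be the number of further tasks a node may accept before its next report, and all names and values are hypothetical.

```python
class TreeNode:
    def __init__(self, weight, name):
        self.weight = weight
        self.name = name
        self.left = None
        self.right = None


def insert(root, weight, name):
    """Insert a server into the binary sort tree, keyed by its weight."""
    if root is None:
        return TreeNode(weight, name)
    if weight < root.weight:
        root.left = insert(root.left, weight, name)
    else:
        root.right = insert(root.right, weight, name)
    return root


def in_order(root):
    """LDR traversal: left subtree, node, right subtree (ascending weight)."""
    if root is not None:
        yield from in_order(root.left)
        yield root.name
        yield from in_order(root.right)


def dispatch(weights, redundancy, num_tasks):
    """Give tasks to the lightest nodes first, never exceeding a node's redundancy."""
    root = None
    for name, weight in weights.items():
        root = insert(root, weight, name)
    remaining = dict(redundancy)  # copy so the caller's values are not modified
    plan = []
    for name in in_order(root):
        while num_tasks > 0 and remaining[name] > 0:
            plan.append(name)
            remaining[name] -= 1
            num_tasks -= 1
    return plan


# Hypothetical weights (lower = lighter load) and redundancy values.
weights = {"node-a": 0.25, "node-b": 0.30, "node-c": 0.10}
redundancy = {"node-a": 2, "node-b": 5, "node-c": 1}
plan = dispatch(weights, redundancy, num_tasks=4)
```

In this sketch the lightest node, node-c, receives only one task because its redundancy is 1, and the remaining tasks spill over to the next nodes in traversal order.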
Keywords/Search Tags:Cluster, Load Balance, Real-time Feedback, Load Redundancy, Binary Sort Tree