
Nonblocking Algorithm And Optimization On Multi-process Network Programming

Posted on: 2014-02-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Z Peng
GTID: 1108330482479005
Subject: Computer software and theory
Abstract/Summary:
In environments with multi-core processors and multi-queue NICs, optimizing the scalability and performance of the network protocol stack and of multi-process programs has become an active research topic. Multi-process network programs currently face three main problems: low multi-process scalability caused by locking of shared data structures; low cache efficiency and low protocol-stack scalability caused by the lack of close coordination between the network protocol stack and the processes; and the difficulty of maintaining session persistence. Because of these problems, multi-process network programs cannot take full advantage of multi-core, multi-queue NIC platforms. In some cases, incorrect session persistence even forces a multi-process network program to run in single-process mode.

This paper seeks optimizations in four areas: non-blocking programming, OS load balancing, the network protocol stack, and the NIC driver. Its main contributions and innovations are as follows.

To address the low multi-process scalability caused by locking of shared data structures, this paper implements MWrite, a non-blocking programming model, and a lock-free array hash table. MWrite is easy to use and highly scalable, and it is especially suitable for read-heavy workloads. A doubly linked list experiment shows that MWrite improves performance by 72.65% on average compared with MCAS. MWrite makes it easy to write highly scalable non-blocking data structures. Under concurrent access by multiple processes, the scalability of the lock-free array hash table is close to 100%. It is 10 times faster to access than a red-black tree or a traditional hash table protected by a read-write lock. With large data sets (>= 10,000 items), collisions can occur that prevent an item from being stored, but theory and experiments show that the probability of such a collision is very low.
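The abstract does not describe the internal layout of the lock-free array hash table, so the following is only a single-threaded Python model of one plausible design: a fixed-size array with a bounded number of probe positions per key. The class name `ArrayHashTable`, the `max_probes` parameter, and the linear probing scheme are all assumptions made for illustration. The sketch shows how an insertion can fail when every probe position is already held by another key, which is the kind of collision the abstract mentions.

```python
from typing import Any, Iterator, List, Optional, Tuple


class ArrayHashTable:
    """Single-threaded model of a fixed-size array hash table (hypothetical design)."""

    def __init__(self, capacity: int, max_probes: int = 4) -> None:
        self.capacity = capacity
        self.max_probes = max_probes
        self.slots: List[Optional[Tuple[Any, Any]]] = [None] * capacity

    def _probe(self, key: Any) -> Iterator[int]:
        # Linear probing over a bounded window (an assumption, not the thesis design).
        start = hash(key) % self.capacity
        for i in range(min(self.max_probes, self.capacity)):
            yield (start + i) % self.capacity

    def put(self, key: Any, value: Any) -> bool:
        """Store the pair; return False if every probe slot holds another key."""
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None or entry[0] == key:
                # A real lock-free table would claim the slot with an atomic
                # compare-and-swap; a plain assignment stands in for it here.
                self.slots[idx] = (key, value)
                return True
        return False

    def get(self, key: Any) -> Optional[Any]:
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None:
                return None  # no deletions in this model, so an empty slot ends the search
            if entry[0] == key:
                return entry[1]
        return None
```

In this model, a table with 8 slots and `max_probes=2` accepts keys 0 and 8 (both hash to slot 0) but rejects key 16, because both of its probe positions are taken. A larger array or probe window makes such failures rarer at the cost of memory or lookup time.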
The lock-free array hash table is easy to implement and solves the locking problem in multi-process web caches.

For network programs whose process count greatly exceeds the number of processor cores, this paper proposes a load-balancing optimization for the Linux operating system that improves the system's task throughput. Linux uses a scheduling-domain load-balancing algorithm: it tries to assign a new process to the idlest CPU of the idlest core, and, if the first CPU of a core is comparatively idle, that CPU periodically tries to pull a moderate number of tasks from the busiest CPU of the core to balance the system workload. Under certain circumstances, this strategy leaves the system more unbalanced. In this paper's optimization, a new process is assigned to the idlest CPU in the entire system, and the idlest CPU of a core can periodically move tasks from that core to itself. Experiments show that this method improves system performance by up to 8%.
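The difference between the two placement policies for a new process can be illustrated with a toy model. This is a sketch in Python, not kernel code: the `Topology` type, the use of run-queue length as the load metric, and the reading of "idlest core" as the core with the lowest total load are all assumptions, and the real Linux scheduler considers many more factors.

```python
from typing import List, Tuple

# topology[core][cpu] = run-queue length (hypothetical load metric)
Topology = List[List[int]]


def place_idlest_core_first(topology: Topology) -> Tuple[int, int]:
    """Pick the idlest core by total load, then the idlest CPU inside it."""
    core = min(range(len(topology)), key=lambda c: sum(topology[c]))
    cpu = min(range(len(topology[core])), key=lambda i: topology[core][i])
    return core, cpu


def place_idlest_cpu_global(topology: Topology) -> Tuple[int, int]:
    """Pick the idlest CPU across the whole system, ignoring core totals."""
    candidates = [(c, i) for c, cpus in enumerate(topology) for i in range(len(cpus))]
    return min(candidates, key=lambda ci: topology[ci[0]][ci[1]])
```

With `topology = [[0, 10], [4, 4]]`, the core-first policy chooses core 1 (total load 8 versus 10) and places the process on a CPU that already has 4 tasks, while the global policy finds the completely idle CPU 0 on core 0. This is one simple case in which choosing the core first leaves the system less balanced.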
To address the low cache efficiency and low protocol-stack scalability caused by the lack of close coordination between the network protocol stack and the processes, this paper presents optimization methods for the multi-process server model and the multi-process proxy server model. They improve the system's cache efficiency, enhance the scalability of the network protocol stack, and let each process maintain session persistence on its own. With these optimizations, the hardware interrupt, soft interrupt, packet reception, and packet transmission for a given connection (or proxy server connection) all use the same NIC queue and are handled by the same processor and the same process. Packets from the same source IP are routed to the same process to ensure session persistence. In three tests, the optimizations increased system performance by 25%, 53%, and 46%, respectively.

Finally, applying the ideas above, this paper implements UVS, an efficient and highly scalable network load-balancing framework based on user-mode packet processing. UVS uses zero-copy techniques to send and receive packets in user mode and implements a TCP/UDP reverse proxy. It uses an array-based lock-free queue to share packets between processes and applies the multi-process server model optimization to improve the system's cache efficiency and scalability. Experiments show that its performance is superior to that of LVS.
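The rule that packets from the same source IP always reach the same process can be sketched as a deterministic mapping from address to worker index. The abstract does not say which hash is used or at which layer the steering happens, so the function name `worker_for_source` and the choice of CRC32 below are assumptions made purely for illustration.

```python
import ipaddress
import zlib


def worker_for_source(src_ip: str, num_workers: int) -> int:
    """Map a source IP address to a worker index in [0, num_workers)."""
    # Hash the packed address bytes so equivalent spellings of the same
    # address map to the same worker. CRC32 is an illustrative choice.
    packed = ipaddress.ip_address(src_ip).packed
    return zlib.crc32(packed) % num_workers
```

Because the mapping depends only on the source address and the worker count, every packet from one client lands on the same worker, so that worker can keep the client's session state locally without sharing it. Changing `num_workers` remaps clients, which a production design would have to account for.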
Keywords/Search Tags: multi-core, multi-process, lock-free programming, nonblocking programming, multi-queue NIC, load balancing, zero-copy