
Research And Optimization Of The Reliability Of Redis Cluster

Posted on: 2018-09-18  Degree: Master  Type: Thesis
Country: China  Candidate: Y Li  Full Text: PDF
GTID: 2348330515496441  Subject: Computer software and theory
Abstract/Summary:
In-memory databases have developed rapidly in recent years, and Redis is the most prominent among them. Redis is a key-value in-memory database that uses redundancy and replication backup to keep data safe, and it supports many programming languages, which improves its compatibility. Redis usually provides data storage services as a highly fault-tolerant, highly available, scalable distributed cluster. These features have promoted the adoption of Redis in areas such as cache systems, cloud computing, and big-data processing. As Redis clusters keep growing in scale, reliability problems begin to appear, and improving reliability reduces the losses caused by malfunctions. This thesis focuses on the fault tolerance system of Redis Cluster and proposes several optimization methods to improve its reliability. The main work is as follows.

(1) For the single-point failure problem in Redis Cluster, this thesis proposes a method for dividing the fault tolerance system into stages, based on the main process of that system. The fault tolerance system guarantees recovery when a single point of failure occurs. During the main process of fault tolerance, the cluster state changes to failed and then to recovered. Based on these state changes, this thesis divides the fault tolerance system of Redis Cluster into two stages: fail detection and failover. The failed node is detected in the fail detection stage, and the cluster is recovered in the failover stage.

(2) For the convergence problem of the fail detection stage, this thesis proposes a communication model for large-scale clusters based on the characteristics of the gossip communication process, and establishes the relation between the convergence speed of fail detection and the communication process. The analysis finds that messages are sent at a low frequency because the communication process is unbalanced, and that when redundant messages are added to heartbeat messages the nodes are chosen at random, which reduces the proportion of useful information. To address this bottleneck in fail detection, this thesis proposes an optimization based on communication load balancing that improves the communication process and speeds up convergence, thereby improving the effectiveness of fail detection.

(3) For the number of failures that the fault tolerance system can tolerate, this thesis proposes a new election algorithm based on the shard structure of Redis Cluster. The original election algorithm limits the function of master nodes, which causes the multi-election problem among slave nodes. In addition, slave nodes do not contribute to the election process, which limits the applicable scope of the algorithm. This thesis implements a new election method, inter-election, based on group coordination. With the optimized election algorithm, the number of failed master nodes that can be tolerated increases.

To compare the reliability of Redis Cluster before and after the optimization, we propose a structure based on Docker containers for simulating a multi-machine, large-scale Redis cluster. According to the test results, the fault tolerance of Redis Cluster improves by up to 28 percent, and the FAIL stage improves by up to 80 percent. With the optimized election algorithm, the number of master node failures that can be tolerated is doubled. More importantly, the optimization has minimal influence on the communication load, operation delay, and throughput of Redis Cluster.
Keywords/Search Tags:Redis Cluster, Reliability, Fault Tolerance Mechanism, Communication Load Balancing, Node Election
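
The two stages named in the abstract, and the limit on tolerated master failures, can be illustrated with a small sketch. It assumes the majority-of-masters rule that stock Redis Cluster is generally described as using, both for marking a suspected node as FAIL and for authorizing a slave's promotion. All function names are hypothetical, and the sketch does not reproduce the optimized inter-election algorithm of the thesis.

```python
# Illustrative sketch only: the function names are hypothetical, and the
# majority-of-masters rule is an assumption about baseline Redis Cluster
# behavior, not the optimized algorithm proposed in the thesis.


def majority(num_masters: int) -> int:
    """Smallest number of masters that forms a strict majority."""
    return num_masters // 2 + 1


def is_marked_fail(reporting_masters: set, num_masters: int) -> bool:
    """Fail detection stage: a suspected node is treated as failed
    cluster-wide once a majority of masters report it unreachable."""
    return len(reporting_masters) >= majority(num_masters)


def slave_wins_election(votes_from_masters: int, num_masters: int) -> bool:
    """Failover stage: a slave is promoted only if it collects votes
    from a majority of all masters (failed masters still count
    toward the total)."""
    return votes_from_masters >= majority(num_masters)


def max_tolerated_master_failures(num_masters: int) -> int:
    """Largest number of simultaneously failed masters that still
    leaves enough live masters to form a majority."""
    return num_masters - majority(num_masters)
```

For example, with five masters `majority(5)` is 3, so under this rule at most two masters can fail at the same time while detection and failover can still complete. This is the kind of limit that the abstract's third contribution targets.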