
Analysis And Optimization Of Replica Consistency For Cassandra Database

Posted on: 2018-02-19  Degree: Master  Type: Thesis
Country: China  Candidate: K J Ma  Full Text: PDF
GTID: 2348330512489126  Subject: Computer application technology
Abstract/Summary:
With the continuous development of the Internet, new types of applications keep emerging and the amount of data is growing rapidly, which poses a great challenge to traditional database storage. Relational databases have the following limitations: they are difficult to scale, have poor read and write performance, are costly, and have limited capacity. NoSQL was proposed to address these problems. Cassandra is an open-source distributed NoSQL database whose high scalability and high availability make it well suited to mass data storage. However, because it has been under development for a relatively short time, some problems may remain in the system. This thesis analyzes Cassandra's data replication technology and replica consistency mechanisms, identifies problems in them, and proposes optimizations. A system based on database replication needs to ensure the consistency of its replicas. Cassandra uses optimistic replication, and its replica management uses the multi-master state propagation mode. Analysis and experiments show that Read repair and Hinted handoff cannot guarantee replica consistency. First, Read repair is performed only after the user reads the data, so two consecutive reads of the same data can return inconsistent results. To solve this problem, we propose a scheme that scans and repairs the full data set, ensuring that the replicas in the cluster stay consistent. Second, Hinted handoff makes write operations more efficient and lets a failed node quickly restore data consistency after it recovers, but it only saves hint information within a certain time window. To solve this problem, we propose a scheme that checks and repairs incremental data, so that even beyond the Hinted handoff time window a failed node can quickly restore data consistency after recovery. This mechanism optimizes Hinted handoff and ensures the consistency of the replicas. Finally, the proposed schemes are implemented and applied to Cassandra. Test results show that the improved system effectively optimizes Read repair and Hinted handoff and ensures the consistency of the replicas in the cluster. Moreover, in most application scenarios, the system performance loss is no more than 5%.
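The two problems described in the abstract can be illustrated with a small toy model. The following Python sketch is not the thesis's implementation and does not use Cassandra's API; the `Replica` class, every function name, and the parameters `window` and `since` are hypothetical and chosen only for illustration. It assumes last-write-wins versions stored as `(timestamp, value)` pairs, a read answered from a single replica, and hints that are no longer stored once a node has been down longer than the hint window.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Replica:
    """Toy replica (hypothetical): key -> (write_timestamp, value)."""
    name: str
    data: dict = field(default_factory=dict)
    down_since: Optional[float] = None  # None means the node is up


def apply_if_newer(replica, key, version):
    """Last-write-wins: keep the version with the higher timestamp."""
    current = replica.data.get(key)
    if current is None or current[0] < version[0]:
        replica.data[key] = version


def newest(replicas, key):
    """Return the newest (timestamp, value) for key across replicas, or None."""
    versions = [r.data[key] for r in replicas if key in r.data]
    return max(versions, key=lambda v: v[0]) if versions else None


def read_with_repair(replicas, key, contacted):
    """Answer from one replica; repair happens only after the answer is fixed."""
    answer = contacted.data.get(key)
    latest = newest(replicas, key)
    if latest is not None:
        for r in replicas:
            apply_if_newer(r, key, latest)
    return answer[1] if answer else None


def full_scan_repair(replicas):
    """Assumed shape of a full-data repair: reconcile every key on every replica."""
    keys = set().union(*(r.data.keys() for r in replicas))
    for key in keys:
        latest = newest(replicas, key)
        for r in replicas:
            apply_if_newer(r, key, latest)


def write(replicas, hints, key, value, ts, window):
    """Write to live replicas; store a hint for a down replica only within the window."""
    for r in replicas:
        if r.down_since is None:
            r.data[key] = (ts, value)
        elif ts - r.down_since <= window:
            hints.append((r.name, key, ts, value))
        # Otherwise the node has been down too long and the write is not hinted.


def replay_hints(replica, hints):
    """Deliver the stored hints to a recovered replica."""
    for name, key, ts, value in hints:
        if name == replica.name:
            apply_if_newer(replica, key, (ts, value))


def incremental_repair(replica, peers, since):
    """Assumed shape of an incremental repair: copy peer writes made at or after `since`."""
    for peer in peers:
        for key, version in peer.data.items():
            if version[0] >= since:
                apply_if_newer(replica, key, version)
```

In this model, a first `read_with_repair` from a stale replica returns the old value and only a second read returns the new one, which is the inconsistency between two reads noted above; running `full_scan_repair` beforehand removes it. Likewise, a write made after the hint window has elapsed never reaches the recovered node through `replay_hints`, and only `incremental_repair` brings it back in line with its peers.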
Keywords/Search Tags: Distributed storage system, NoSQL, Cassandra, replica consistency