
Optimization Of Replica Recovery In Distributed Databases

Posted on: 2021-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: C F Zhu
Full Text: PDF
GTID: 2428330620968183
Subject: Software engineering
Abstract/Summary:
With the continuous development of society, large amounts of data are produced continuously. Data is a valuable resource, and exploiting it requires a well-designed database system that can store large volumes of data. For high availability, a distributed database system is commonly used, which can also perform load balancing reliably. However, network instability and system failures are inevitable, so a reliable and efficient data recovery algorithm is needed to prevent data loss. In a distributed database system, a distributed consensus algorithm is typically used to synchronize data and keep the replica nodes consistent. This thesis uses the Paxos algorithm to replicate logs between replica nodes; because Paxos allows a replica's log to contain empty entries (holes) after synchronization, a replica recovering from a node failure can reach a consistent state through its local logs combined with interaction with other replica nodes.

The main contributions of this thesis are as follows:

1. For workloads with highly conflicting log entries in a distributed database system, this thesis designs and optimizes the Redo log entry structure and log file structure. During log replication, a cache queue and the optimized Redo log entries are used to record, in the current log entry, the indexes of a subset of log entries that meet the conflict conditions. When a node enters the data recovery state, these log files allow it to skip filling in some of the empty (hole) log entries.

2. Based on the idea of log filtering, this thesis studies how to achieve consistent recovery of data in a multi-replica environment. Using log replay, a Paxos-based backup node recovery algorithm and a master node recovery algorithm are designed. During log replay, the redesigned log entries are used to avoid redoing redundant log entries and to reduce network interaction with other storage nodes, thereby shortening the data recovery time of the failed storage node.

3. This thesis studies concurrent replica recovery based on the frequency of data access. The main factor limiting data recovery is the number of I/O operations in the system: during recovery from replica downtime, most of the time is spent reading log pages, reading data pages, and flushing data pages to disk. A recovery method based on data access frequency avoids repeatedly reading and writing the data pages referenced by Redo log entries while replaying the Redo log files, reducing the number of disk I/Os and thus improving data recovery efficiency. A concurrent recovery strategy is also designed to further speed up replica recovery.

Finally, this thesis designs and implements a multi-replica prototype system, Paxos-replication, in which the above optimization schemes are implemented, and performs experiments to verify the effectiveness of the recovery methods. For the case where a single replica of the distributed database is restored from local logs only, the optimization scheme is also implemented in the disk-based database prototype system DB_SELT and verified experimentally.
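The first contribution can be illustrated with a small sketch. The Python below is a hypothetical rendering, assuming a log entry that carries "conflict hint" indexes alongside its payload: a bounded cache queue records the indexes of recently conflicting entries, and each new entry snapshots those indexes so a recovering replica can skip hole-filling for them. All names and field choices here are the editor's assumptions for illustration, not the thesis's actual data structures.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List

@dataclass
class RedoLogEntry:
    """Sketch of an augmented Redo log entry (hypothetical layout)."""
    index: int                 # position in the replicated log
    term: int                  # Paxos ballot under which it was chosen
    payload: bytes             # the actual redo record
    conflict_hints: List[int] = field(default_factory=list)

class ConflictCache:
    """Bounded queue of recently conflicting log entry indexes."""
    def __init__(self, capacity: int = 8):
        self._queue = deque(maxlen=capacity)

    def record(self, index: int) -> None:
        self._queue.append(index)

    def snapshot(self) -> List[int]:
        return list(self._queue)

def append_entry(log, cache, term, payload, conflicted=False):
    # Embed the current conflict hints into the new entry, so a
    # recovering node can read them later without extra network calls.
    entry = RedoLogEntry(index=len(log), term=term, payload=payload,
                         conflict_hints=cache.snapshot())
    log.append(entry)
    if conflicted:
        cache.record(entry.index)
    return entry
```

During recovery, a node could consult `conflict_hints` of its latest durable entry to decide which log holes do not need to be filled via interaction with other replicas.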
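The third contribution (frequency-aware replay) can be sketched as a simple cost model. The Python below is an illustrative model, not the thesis's implementation, under the assumption that each touched data page costs one read and one flush: grouping Redo entries by data page and replaying the most frequently accessed pages first means each page is read and flushed exactly once, instead of once per log entry.

```python
from collections import Counter, defaultdict

def replay_by_frequency(redo_entries):
    """redo_entries: list of (page_id, op) pairs in log order.

    Groups entries by data page and replays the hottest pages first,
    so each page is read once and flushed once. Returns the simulated
    disk I/O count (page reads + page flushes).
    """
    by_page = defaultdict(list)
    for page_id, op in redo_entries:
        by_page[page_id].append(op)        # keep log order within a page

    freq = Counter(page_id for page_id, _ in redo_entries)
    io_count = 0
    for page_id, _ in freq.most_common():  # most-accessed pages first
        io_count += 1                      # read the data page once
        page_state = []
        for op in by_page[page_id]:
            page_state.append(op)          # apply each redo op in memory
        io_count += 1                      # flush the data page once
    return io_count

def replay_naive(redo_entries):
    """Baseline: read and flush the page once per log entry."""
    return 2 * len(redo_entries)
```

With entries touching pages A, B, A, the grouped replay performs 4 page I/Os versus 6 for the naive baseline; the gap grows with the number of repeated accesses to hot pages.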
Keywords/Search Tags: Consistency recovery, high-conflict log entries, log replication, log filtering, replica recovery