Research On System Level Data Protection Technology

Posted on: 2009-04-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Li
GTID: 1118360275470948
Subject: Computer system architecture
Abstract/Summary:
In the current information age, with the explosive growth of data and the rising value of that data, storage systems must take efficient measures to guarantee the reliability, availability and security of users' data. Since storage systems face the various challenges of mass data, protecting data efficiently has become an important task. Because of the complexity of network storage systems, this problem is hard to solve with any single technology. A systematic, multi-level and integrated approach is needed to research and design data protection technologies for storage systems. These technologies include continuous data protection (CDP), fault tolerance for redundant data, a scalable data placement algorithm and self-adaptive data migration. Together they provide an overall data protection solution for storage systems.

Firstly, block-level CDP technology is researched and designed at the level of a single storage node device. Based on the merits and shortcomings of existing CDP mechanisms, an improved mechanism, ST-CDP, based on TRAP (Timely Recovery to Any Point-in-time) is proposed. It retains the data logging method of TRAP and inserts a few snapshots into the recovery chain at set intervals. This mechanism avoids the chain crash problem and greatly shortens recovery time. A quantitative model is used to analyze its performance and calculate an optimal interval value for ST-CDP. The block-level ST-CDP mechanism has been implemented in a prototype system. Experimental results show that ST-CDP not only has lower storage space overhead and lower impact on the storage system, but also has high recovery efficiency.

Secondly, to improve the reliability of redundant data, a fault-tolerant scheme using erasure coding with small Low-Density Parity-Check (LDPC) codes is designed. Since redundant data are important for data recovery, they need a fault-tolerant scheme to enhance their reliability.
By adding this scheme, data can be protected not only within a single storage node but also across nodes. An optimal LDPC configuration is then derived by analyzing its effect on data distribution and storage overhead. The scheme is implemented in an iSCSI cluster storage system, and performance test results show that it provides high reliability for redundant data with little extra space overhead, under both normal and degraded workload conditions.

Thirdly, because the working status of storage nodes in a large-scale network storage system changes frequently due to node failures and device updates, an efficient data placement algorithm is developed. Such an algorithm should distribute data evenly and minimize data movement when the status of storage nodes changes. A data object placement algorithm based on dynamic interval mapping is proposed to distribute data objects evenly in the storage system. Simulation results demonstrate that this algorithm supports weighted allocation across storage nodes, and that when the scale or the weights of the storage system change, the amount of data movement matches the theoretical value well.

Lastly, a data migration solution is researched and implemented by adopting feedback control technology from control theory. In the course of protecting data and keeping the storage system load-balanced, data migration becomes an important operation. That is, data migration is indispensable for data protection and redundancy-based fault tolerance inside the storage system, and for remote data backup outside it. However, since data migration consumes large amounts of CPU and memory resources, a self-adaptive data migration scheme is needed to eliminate the negative effect of migration on the normal workload.
The solution can execute the relevant data migration strategy for each application scenario and can dynamically adjust its migration strategy according to the current system status.

Thus, from continuous data protection to the fault-tolerant scheme for redundant data, and from the data placement algorithm to the data migration method, these technologies are integrated to form an efficient solution for protecting invaluable data.
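
The idea of inserting snapshots into a TRAP-style recovery chain can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes that each write to a block is logged as the XOR of the old and new contents, and that a full copy is kept every `interval` writes. The names `BlockHistory`, `interval`, `write` and `recover` are hypothetical.

```python
from dataclasses import dataclass, field


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


@dataclass
class BlockHistory:
    """History of one block: XOR deltas plus periodic full snapshots (assumed layout)."""
    interval: int                                  # hypothetical snapshot interval, in writes
    initial: bytes                                 # contents at version 0
    deltas: list = field(default_factory=list)     # deltas[i] = version i XOR version i+1
    snapshots: dict = field(default_factory=dict)  # version number -> full copy
    current: bytes = b""

    def __post_init__(self):
        self.current = self.initial
        self.snapshots[0] = self.initial           # version 0 acts as the first snapshot

    def write(self, new_data: bytes) -> None:
        if len(new_data) != len(self.current):
            raise ValueError("block size must not change")
        # Log only the XOR of old and new contents.
        self.deltas.append(xor_bytes(self.current, new_data))
        self.current = new_data
        version = len(self.deltas)
        # Every `interval` writes, also keep a full copy to cut the chain.
        if version % self.interval == 0:
            self.snapshots[version] = new_data

    def recover(self, version: int):
        """Return (data at `version`, number of XOR steps used)."""
        if not 0 <= version <= len(self.deltas):
            raise ValueError("unknown version")
        base = (version // self.interval) * self.interval  # nearest snapshot at or before
        data = self.snapshots[base]
        steps = 0
        for i in range(base, version):
            data = xor_bytes(data, self.deltas[i])
            steps += 1
        return data, steps
```

In this sketch a recovery never replays more than `interval - 1` deltas, and a damaged delta only affects the versions between two snapshots rather than the whole chain. A smaller `interval` shortens recovery but stores more full copies, which is the kind of trade-off an optimal interval value would balance.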
Keywords/Search Tags: Storage system, Reliability, Continuous data protection, Erasure code, Data distribution, Data migration