
The Analysis And Design Of A Data Backup System Based On HDFS

Posted on: 2014-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: W L Xu
GTID: 2248330398972232
Subject: Computer Science and Technology
Abstract/Summary:
Data backup is an important means of ensuring data security. However, existing data backup systems often store data on expensive servers, which greatly increases cost and also reduces system performance. Meanwhile, frequent upgrades of computer hardware have left a large amount of computer equipment idle. Cloud storage technology, which can make full use of these idle resources, offers a new approach to data backup. The Hadoop Distributed File System (HDFS), a cloud storage software system, uses such idle resources to build a distributed cluster that addresses the problems of data storage and security. This thesis analyzes mainstream backup solutions in order to adopt their strengths and compensate for their shortcomings.

This thesis applies cloud storage technology to data backup, and analyzes and designs a data backup system based on the distributed file system HDFS. The system builds a low-cost, scalable distributed cluster using cloud storage technology and meets users' needs for data backup, recovery, and archiving. It further improves the performance and security of the system by combining compression, encryption, and other techniques.

The thesis consists of the following parts:
1. Introduces the background of data backup systems and summarizes the mainstream backup systems together with their advantages and disadvantages.
2. Summarizes the theoretical foundations of data backup and HDFS: the concept, strategies, classification, and structure of data backup, as well as cloud storage technology and the relevant HDFS knowledge.
3. Presents the requirements analysis and detailed design of the project: the functional requirements analysis, the overall architecture and functional architecture, the process design of each functional module, and the database design.
4. Builds the platform according to the design and puts it into trial operation, which confirms the correctness and reliability of the design. The performance of the system is also evaluated and compared.
5. Summarizes the thesis, pointing out its shortcomings and the directions in which further research is needed.

Experimental results show that the system has advantages in security, reliability, scalability, and economy. Security is improved through the combination of compression, encryption, and other techniques. For reliability, backup data in the HDFS cluster is preserved mainly through replicas, so the data remains usable when a node fails. For scalability, the HDFS cluster can be expanded to increase the backup capacity of the system without affecting overall performance. Finally, because HDFS is a distributed file system designed for inexpensive hardware, the large number of idle computers can be put to use, which reduces the cost of procuring equipment.
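
The reliability claim above (replicas keep backup data usable when a node fails) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the thesis's implementation and not the HDFS API: the helper names `place_replicas` and `readable_blocks`, the node names, and the round-robin placement are all hypothetical. Real HDFS uses a rack-aware placement policy; only the replication factor of 3 reflects the HDFS default.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Toy placement: put each block on `replication` distinct nodes, round-robin.

    Hypothetical helper for illustration; HDFS itself places replicas
    with a rack-aware policy rather than round-robin.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, block in enumerate(block_ids):
        # Consecutive nodes (mod cluster size) are distinct while replication <= len(nodes).
        placement[block] = {nodes[(i + k) % len(nodes)] for k in range(replication)}
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return {block for block, holders in placement.items() if holders - failed}


# Assumed example cluster: four nodes, eight backup blocks, three replicas each.
nodes = ["node1", "node2", "node3", "node4"]
blocks = [f"blk_{i}" for i in range(8)]
placement = place_replicas(blocks, nodes, replication=3)

# With three replicas per block, losing any single node leaves every block readable.
survivors = readable_blocks(placement, ["node2"])
print(len(survivors), "of", len(blocks), "blocks readable after one node failure")
```

In this model a block is lost only if every node holding one of its replicas fails, so three replicas tolerate any two simultaneous node failures.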
Keywords/Search Tags: Backup System, Data Backup, Data Restore, Cloud Storage, HDFS