
Research And Implementation Of Distributed Data Shared Storage System

Posted on: 2014-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: L Shen
GTID: 2248330398450252
Subject: Computer application technology
Abstract/Summary:
With the arrival of the big data era, the volume of information keeps growing and the need for data sharing is becoming more urgent. Cloud storage has taken the place of the traditional data center, and many researchers have begun to study shared data storage in depth. As cloud storage expands, maintenance and energy-consumption problems have emerged, and making cloud storage cheaper and more efficient has become a hot topic. However, current data storage, whether centralized or distributed, suffers from poor scalability, data redundancy, and similar problems. As cloud computing develops further, these problems will become more pronounced.

This thesis examines existing storage systems, including traditional disk file systems, distributed storage systems with a central node, and storage systems with custom deduplication functions. All of these systems lack a data sharing service or provide only an imperfect one, so duplicate data still occupies a large amount of space. This thesis designs a shared storage system to address data sharing and the management of files and chunks in cloud storage. The system uses a distributed architecture without a central node to remove duplicate data in a distributed environment, which provides better load balancing and more efficient deduplication.

To share duplicate data more efficiently, this thesis designs a two-direction sliding window chunking algorithm. The algorithm combines the efficiency of Rabin fingerprinting with the advantages of parallel computing: data is quickly divided into appropriately sized chunks, and chunks serve as the unit of data sharing between files. An in-memory hash map and a Bloom filter are used to manage files and data chunks in order to reduce the performance loss caused by chunking.

Experimental results show that the system greatly reduces the physical space used for storage without significantly affecting storage performance: physical storage space is reduced by 40% while data storage speed drops by about 10%. The system therefore already has some practical value.
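The chunk-level sharing described in the abstract can be illustrated with a minimal sketch. It is not the thesis's two-direction algorithm, which the abstract does not specify: it uses an ordinary single-direction sliding window, a simple polynomial rolling hash standing in for a true Rabin fingerprint, and a single in-process store rather than a distributed one. All names and constants (`chunk`, `BloomFilter`, `ChunkStore`, `WINDOW`, `MASK`, and so on) are hypothetical and chosen only for illustration.

```python
import hashlib

# All constants below are illustrative assumptions, not values from the thesis.
WINDOW = 16                     # bytes covered by the sliding window
BASE = 257                      # polynomial base of the rolling hash
MOD = (1 << 31) - 1             # modulus of the rolling hash
MASK = (1 << 6) - 1             # cut when the low 6 bits of the hash are zero
MIN_CHUNK, MAX_CHUNK = 32, 256  # bounds on chunk size in bytes


def chunk(data: bytes):
    """Split data at content-defined boundaries found by a rolling hash."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        size = i - start + 1
        if size > WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + b) % MOD                    # take in the newest byte
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])                 # trailing partial chunk
    return chunks


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, bits=1 << 16, hashes=4):
        self.bits, self.hashes = bits, hashes
        self.array = bytearray(bits // 8)

    def _positions(self, key: bytes):
        digest = hashlib.sha256(key).digest()
        for k in range(self.hashes):
            yield int.from_bytes(digest[4 * k:4 * k + 4], "big") % self.bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.array[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: bytes):
        return all(self.array[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


class ChunkStore:
    """Stores each distinct chunk once; files are lists of chunk fingerprints."""

    def __init__(self):
        self.bloom = BloomFilter()
        self.index = {}  # fingerprint -> chunk bytes (in-memory hash map)
        self.files = {}  # file name -> ordered list of fingerprints

    def put(self, name: str, data: bytes):
        recipe = []
        for c in chunk(data):
            fp = hashlib.sha1(c).digest()
            # The Bloom filter skips the index lookup for definitely-new chunks.
            is_dup = self.bloom.might_contain(fp) and fp in self.index
            if not is_dup:
                self.index[fp] = c
                self.bloom.add(fp)
            recipe.append(fp)
        self.files[name] = recipe

    def get(self, name: str) -> bytes:
        return b"".join(self.index[fp] for fp in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.index.values())
```

In this sketch, storing the same content under two names adds no new chunk bytes, because the second file only records fingerprints that already exist in the index. Because cut points depend on content rather than position, files that share long runs of bytes will typically also share most of their chunks, though the sketch makes no claim about the space savings reported above.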
Keywords/Search Tags: Distributed storage, Data Share, Data Chunking