Software Defined Network Based Data Deduplication Systems

Posted on: 2017-09-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z Feng
Full Text: PDF
GTID: 2348330503489811
Subject: Computer system architecture
Abstract/Summary:
With the development of big data and mobile internet technologies, data volumes are growing dramatically, and more and more of that data needs to be backed up for protection. To reduce storage cost and network traffic, data deduplication is widely used in cloud backup systems to eliminate redundancy. Although deduplication is relatively mature in storage systems, applying it to network transmission still raises many problems in practice. For example, in a cloud backup system with source deduplication, the client cache and WAN (Wide Area Network) bandwidth are limited, so asking the server whether each chunk is redundant incurs a long delay. That delay blocks the upload of subsequent data and lengthens the backup window, and the problem is more serious with global deduplication. An efficient deduplication method is therefore important for improving source deduplication performance in cloud backup systems and reducing backup time.

As programmable network technologies such as Software Defined Network (SDN) mature, deploying complex applications in the network becomes easier. In this paper, we propose deploying the redundancy detection logic in the network, which speeds up the redundancy judgment and thus reduces backup time. We cache chunk fingerprints only in the SDN controller rather than adding a cache to every backup client, since per-client caches introduce, to some extent, redundant fingerprints across users. The proposed method detects redundancy in-network using a Bloom filter and a cache index structure in the SDN controller. We also install specific flow entries to implement querying and updating of the fingerprint cache, so that whether data is redundant can be determined quickly.
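The in-network redundancy check described above can be illustrated with a small sketch. The abstract does not give the thesis's actual data structures or controller code, so everything here is an assumption made for illustration: the class names `BloomFilter` and `FingerprintCache`, the SHA-1 chunk fingerprint, the LRU eviction policy, and the three-way answer (`unique`, `duplicate`, `ask_server`). Flow-entry installation is not modeled; the sketch only shows how a Bloom filter in front of a bounded fingerprint index could answer most queries without a round trip to the backup server.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Fixed-size Bloom filter using double hashing (illustrative only)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key: bytes):
        # Derive k bit positions from two 64-bit halves of one SHA-256 digest.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: bytes) -> None:
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


class FingerprintCache:
    """Hypothetical controller-side cache: Bloom filter plus a bounded LRU index."""

    def __init__(self, capacity=1024):
        self.bloom = BloomFilter()
        self.index = OrderedDict()  # fingerprint -> True, oldest entry first
        self.capacity = capacity

    def lookup(self, fp: bytes) -> str:
        # A Bloom-filter miss is definitive: this fingerprint was never recorded.
        if not self.bloom.might_contain(fp):
            return "unique"
        # A hit in the index confirms the chunk is a duplicate.
        if fp in self.index:
            self.index.move_to_end(fp)
            return "duplicate"
        # Bloom hit but index miss: evicted entry or false positive.
        return "ask_server"

    def update(self, fp: bytes) -> None:
        self.bloom.add(fp)
        self.index[fp] = True
        self.index.move_to_end(fp)
        if len(self.index) > self.capacity:
            self.index.popitem(last=False)  # evict the least recently used entry


def fingerprint(chunk: bytes) -> bytes:
    """Chunk fingerprint; SHA-1 is an assumption, not taken from the thesis."""
    return hashlib.sha1(chunk).digest()
```

In this sketch, only the `ask_server` case would still cost a WAN round trip. A `unique` answer lets the client upload the chunk immediately, and a `duplicate` answer lets it skip the upload.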
The contribution of this method is that it introduces deduplication into the SDN. It not only reduces the client backup window but also, compared with conventional redundancy elimination methods, avoids complex encoding and decoding operations in intermediate network devices. Experimental results demonstrate that, compared with traditional source deduplication, the SDN-based method greatly reduces backup time and improves system throughput while maintaining a similar deduplication rate. The improvement is more pronounced when network bandwidth is limited.
Keywords/Search Tags: Cloud backup, Data deduplication, Backup window, Software Defined Network, Cache