Research And Implementation Of Parallel Migratory Compression Algorithm

Posted on: 2017-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: S P Zhou
Full Text: PDF
GTID: 2348330503989808
Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of data volumes, data deduplication and data compression, two important space-saving techniques, have attracted increasing attention. Compared with data deduplication, data compression can eliminate more redundancy. However, a compressor usually searches for redundancy only within a limited range, called the lookback window. If two similar chunks lie farther apart than this range, the compressor can hardly eliminate the redundancy between them. Migratory compression (MC) relocates similar data chunks so that they sit next to each other, giving traditional compressors a better chance to eliminate redundancy and thereby improving the compression factor.

However, the existing MC algorithm is implemented serially and handles similar data only within a single file, not across files. This makes it a bottleneck for compression performance in mass storage systems and network transfer. To solve this problem, a parallel migratory compression algorithm is proposed. On the one hand, the algorithm uses a parallel, pipelined design to reduce the time spent in compute-intensive modules such as chunking, deduplication and similarity detection. On the other hand, it applies the Asymmetric Extremum chunking algorithm to accelerate chunking, and uses data deduplication to improve the efficiency of similarity detection. In addition, to address the low throughput of the reorganization compression module caused by the parallel design, a chunk-prefetching strategy based on the migrate recipe is introduced, which reduces the disk waiting latency of the reorganization process and boosts the throughput of the reorganization compression module.

Experimental results on real-world datasets demonstrate that the parallel migratory compression algorithm can increase the compression ratio of traditional compression algorithms by 65%-85% and reduce the overall time expense.
Keywords/Search Tags:Lookback Window, Migratory Compression, Parallelization, Chunk-Prefetch
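
The lookback-window effect described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes fixed-size 8 KiB chunks, a similarity pairing that is known by construction (a real system would need similarity detection), and zlib as the traditional compressor, whose DEFLATE window is 32 KiB. The names `CHUNK`, `N`, `make_chunks`, `migrate` and `restore` are hypothetical. Similar chunks start out 128 KiB apart, beyond the window. After migration they are adjacent, and a simple recipe of original indices allows the original order to be restored.

```python
import random
import zlib

CHUNK = 8 * 1024   # assumed fixed chunk size, smaller than zlib's 32 KiB window
N = 16             # distinct base chunks: 16 * 8 KiB = 128 KiB, larger than the window


def make_chunks(seed=0):
    """Build N pseudo-random chunks followed by N near-duplicates of them."""
    rng = random.Random(seed)
    base = [rng.getrandbits(8 * CHUNK).to_bytes(CHUNK, "little") for _ in range(N)]
    # Each near-duplicate differs from its base chunk only in the first byte.
    near = [bytes([c[0] ^ 0xFF]) + c[1:] for c in base]
    return base + near  # original layout: similar chunks are 128 KiB apart


def migrate(chunks):
    """Place each chunk next to its similar chunk.

    recipe[pos] is the original index of the chunk stored at position pos.
    The pairing (i, i + N) is known by construction in this sketch.
    """
    recipe = [j for i in range(N) for j in (i, i + N)]
    return [chunks[j] for j in recipe], recipe


def restore(migrated, recipe):
    """Undo the migration using the recipe."""
    out = [None] * len(recipe)
    for pos, j in enumerate(recipe):
        out[j] = migrated[pos]
    return out


chunks = make_chunks()
migrated, recipe = migrate(chunks)

original_size = len(zlib.compress(b"".join(chunks)))
migrated_size = len(zlib.compress(b"".join(migrated)))

print("compressed size, original order:", original_size)
print("compressed size, migrated order:", migrated_size)
```

In the original order the compressor cannot see a chunk's near-duplicate, so the pseudo-random data is essentially incompressible. In the migrated order each near-duplicate falls inside the window of its partner and can be encoded as back-references.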