
Massive Data Compression Algorithm In Parallel

Posted on: 2007-12-19; Degree: Master; Type: Thesis
Country: China; Candidate: C B Tang; Full Text: PDF
GTID: 2208360185473139; Subject: Computer application technology
Abstract/Summary:
As information technology is applied ever more closely to production practice, some production departments that handle large volumes of data must run heavy computations to process them. The results in turn demand large amounts of storage. This thesis studies the storage problem for the large data sets that such computations produce in practice. I propose applying parallel programming techniques to data compression, with the aim of achieving high compression speed while also saving storage space. Storing large data sets in compressed form involves two technologies. The first is data compression: I adopt adaptive arithmetic coding to address the storage-space problem. The second is parallel programming, which I use to address compression speed. The thesis seeks a solution that combines adaptive arithmetic coding with parallel programming, so that compressed storage can be applied in practice. Arithmetic coding was developed after Huffman coding and avoids Huffman coding's drawback of assigning a separate code to every symbol. Instead, it assigns a single code to the whole input stream, which lets the code length approach the entropy of the source. It is a mature method with many variants. I choose adaptive arithmetic coding to encode the input stream because it is well suited to text files. Parallel programming technology was developed for large-scale data and depends on parallel hardware. It is developing rapidly, and many research groups in China study parallel programming and parallel hardware. My scheme uses the workstations in our laboratory.
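The idea of assigning one code to the whole input stream, with symbol probabilities learned as the data is read, can be sketched as follows. This is a minimal illustration and not the thesis's implementation: it assumes a byte-oriented order-0 model whose counts all start at 1, it uses exact fractions instead of the fixed-precision integer arithmetic a practical coder would need, and the names `encode`, `decode`, and `adaptive_ideal_bits` are hypothetical.

```python
from fractions import Fraction
import math

ALPHABET = 256  # assumption: symbols are bytes, modelled without context (order 0)


def encode(data: bytes):
    """Return (code, nbits): one nbits-bit integer that identifies the whole input."""
    counts = [1] * ALPHABET              # adaptive model: every count starts at 1
    low, width = Fraction(0), Fraction(1)
    for s in data:
        total = sum(counts)
        below = sum(counts[:s])          # cumulative count of symbols smaller than s
        low += width * Fraction(below, total)
        width *= Fraction(counts[s], total)
        counts[s] += 1                   # update the model after coding the symbol
    # Smallest nbits with 2**-nbits <= width / 2, computed with integers only.
    q = -(-2 * width.denominator // width.numerator)        # ceil(2 / width)
    nbits = (q - 1).bit_length()
    code = -(-low.numerator * 2**nbits // low.denominator)  # ceil(low * 2**nbits)
    return code, nbits


def decode(code: int, nbits: int, n: int) -> bytes:
    """Recover n symbols by replaying the same adaptive model as the encoder."""
    counts = [1] * ALPHABET
    value = Fraction(code, 2**nbits)
    low, width = Fraction(0), Fraction(1)
    out = bytearray()
    for _ in range(n):
        total = sum(counts)
        target = (value - low) / width * total   # lies in [0, total)
        below = 0
        for s, c in enumerate(counts):
            if target < below + c:               # symbol whose sub-interval holds value
                break
            below += c
        low += width * Fraction(below, total)
        width *= Fraction(c, total)
        counts[s] += 1
        out.append(s)
    return bytes(out)


def adaptive_ideal_bits(data: bytes) -> float:
    """Sum of -log2(p) under the same adaptive model (the model's ideal code length)."""
    counts = [1] * ALPHABET
    bits = 0.0
    for s in data:
        bits -= math.log2(counts[s] / sum(counts))
        counts[s] += 1
    return bits
```

In this sketch the message length is passed to `decode` separately, and the emitted code is always fewer than two bits longer than the ideal code length of the adaptive model. A Huffman code, by contrast, spends a whole number of bits on each symbol.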
The laboratory workstations are connected over a local network to form a computer cluster, on which I implement my parallel compression scheme. The program uses the popular MPI (Message Passing Interface) standard, which provides bindings for C and Fortran. A C program can call the functions in the MPI library, so parallel programs can be written conveniently in a C environment. Targeting a practical application, LandMark's data files, and the problem of task allocation in a parallel program, I propose partitioning the original data so that the entropy is equal in every sub-block. The aim is to keep processing synchronized across all threads, to ease the burden on the module that merges the results, and to ensure that the compressed file is assembled in the correct order. To decide the order of the context model, I propose choosing it according to the maximal probability, which keeps the algorithm efficient. My scheme combines these two technologies: I treat arithmetic coding in detail and implement the scheme in the laboratory. It combines the strengths of both methods and achieves good results.
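One possible reading of the equal-entropy partitioning rule is sketched below. The abstract does not give the exact procedure, so this is an assumption: each byte is charged its information content under the global byte frequencies of the file, and the input is cut into contiguous sub-blocks that each carry roughly the same total. The names `symbol_costs` and `split_by_information` are hypothetical.

```python
import math
from collections import Counter


def symbol_costs(data: bytes) -> dict:
    """Information content, in bits, of each byte value under global frequencies."""
    freq = Counter(data)
    if len(freq) == 1:
        # Degenerate input (one distinct byte): fall back to equal weights.
        return {next(iter(freq)): 1.0}
    n = len(data)
    return {s: -math.log2(c / n) for s, c in freq.items()}


def split_by_information(data: bytes, parts: int) -> list:
    """Cut data into `parts` contiguous blocks of roughly equal information content."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    if not data:
        return [b""] * parts
    cost = symbol_costs(data)
    total = sum(cost[s] for s in data)
    blocks, start, acc = [], 0, 0.0
    for i, s in enumerate(data):
        acc += cost[s]
        # Close the current block once it has reached its share of the total.
        if len(blocks) < parts - 1 and acc >= total * (len(blocks) + 1) / parts:
            blocks.append(data[start:i + 1])
            start = i + 1
    blocks.append(data[start:])
    blocks += [b""] * (parts - len(blocks))   # pad when data is shorter than parts
    return blocks
```

Because the blocks are contiguous and returned in order, concatenating the per-block outputs in block order keeps the compressed file ordered. Each block could then be handed to one worker of the cluster for compression.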
Keywords/Search Tags: Data Compression, Adaptive Arithmetic Coding, Parallel Program Technology, MPI Interface