
Design And Implementation Of Improved Huffman-based Context Data Compression Algorithm

Posted on: 2018-08-28  Degree: Master  Type: Thesis
Country: China  Candidate: K Y Qu  Full Text: PDF
GTID: 2348330518994574  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, huge amounts of data need to be stored and processed in many fields, such as bioengineering, digital broadcasting, high-definition television, aerospace, and national defense, which poses a great challenge to disk storage and network transmission performance. The compression of massive data has therefore attracted increasing attention, because it saves storage space and improves the speed of network transmission.

This paper first discusses the classical data compression algorithms and their principles. Owing to its high compression ratio, fast encoding, and ease of implementation, the Huffman algorithm has been widely used in data compression. However, although it compresses well, the traditional Huffman algorithm does not take the context of the data into account; as a result, considerable redundancy remains, much disk space is still occupied, and the compression effect can be further improved.

This paper presents a Huffman compression algorithm based on a Markov chain, which reduces the average code length of the traditional Huffman algorithm and thus achieves a better compression effect. First, a Markov model is used to explore the correlation within the data to be compressed: every character is regarded as a state, and the number of occurrences of each state and the number of transitions between states are counted. From these statistics, the transition probability matrix is obtained and the correlation between contexts in the data is judged. Finally, the algorithm compresses correlated contexts together to achieve a better compression effect.

When the amount of data is large, a standalone machine cannot make full use of the compression algorithm because its processing capability is insufficient. Therefore, this paper ports the compression algorithm to Hadoop, one of the most popular distributed computing frameworks. In this way, large amounts of data can be compressed in parallel and compression efficiency is greatly improved. Experimental results show that the proposed approach not only shortens compression time, but also reduces storage space and further accelerates transmission over the network.
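The context-conditioned coding idea described above can be illustrated with a short sketch. The Python listing below is a minimal illustration under stated assumptions, not the thesis's implementation: the function names (transition_counts, transition_matrix, huffman_code, encode) and the sample string are invented for this sketch. It counts symbol occurrences and symbol-to-symbol transitions, normalizes the transition counts into the transition probability matrix mentioned in the abstract, and builds a separate Huffman table for each preceding-symbol context, so strongly correlated contexts receive shorter codes than a single global Huffman table would give them.

# Minimal sketch of Markov-chain-conditioned Huffman coding.
# Not the author's code; names and sample data are hypothetical.
import heapq
from collections import Counter, defaultdict


def transition_counts(data: str):
    # Count occurrences of each symbol and of each (prev, next) transition.
    occurrences = Counter(data)
    transitions = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        transitions[prev][nxt] += 1
    return occurrences, transitions


def transition_matrix(transitions):
    # Normalize transition counts into per-context probability distributions;
    # this is the transition probability matrix used to judge correlation.
    matrix = {}
    for prev, counter in transitions.items():
        total = sum(counter.values())
        matrix[prev] = {sym: n / total for sym, n in counter.items()}
    return matrix


def huffman_code(freqs):
    # Build a Huffman code {symbol: bitstring} from a frequency table.
    if len(freqs) == 1:                       # degenerate single-symbol case
        return {next(iter(freqs)): "0"}
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(data: str) -> str:
    # Encode data with one Huffman table per preceding-symbol context;
    # the first symbol falls back to a global (unconditioned) table.
    occurrences, transitions = transition_counts(data)
    global_code = huffman_code(occurrences)
    context_codes = {prev: huffman_code(c) for prev, c in transitions.items()}
    bits = [global_code[data[0]]]
    for prev, nxt in zip(data, data[1:]):
        bits.append(context_codes[prev][nxt])
    return "".join(bits)


if __name__ == "__main__":
    sample = "abracadabra abracadabra"
    plain = "".join(huffman_code(Counter(sample))[c] for c in sample)
    print(len(encode(sample)), "bits (context-conditioned)")
    print(len(plain), "bits (plain Huffman)")

In practice the per-context tables must also be transmitted or agreed upon, so the gain appears only when the data is long enough and the contexts are sufficiently predictive; the transition probability matrix can serve as the criterion for deciding which contexts are worth conditioning on.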
Keywords/Search Tags: Markov Chain Model, Huffman Algorithm, Data Compression, Context