Font Size: a A A

On the design and analysis of grammar-based data compression algorithms

Posted on:2003-03-07Degree:Ph.DType:Thesis
University:University of Waterloo (Canada)Candidate:Jia, YunweiFull Text:PDF
GTID:2468390011487933Subject:Engineering
Abstract/Summary:
The grammar-based coding theory is a new development in the field of source coding. It provides a framework to design lossless data compression algorithms by combining the power of string matching and that of statistical modelling. In this theory, each algorithm first transforms a sequence to be compressed into a grammar, and then uses an arithmetic code to compress the grammar. In this thesis research, we investigate the following topics of the design and analysis of grammar-based data compression algorithms: (1) Arithmetic coding: To meet the special requirement on arithmetic coding posed by the grammar-based coding theory, a new arithmetic coding scheme called the multilevel arithmetic coding is proposed to encode sequences from large and unbounded alphabets. The optimality of the multilevel arithmetic coding scheme is established. A greedy renormalization method is presented to significantly reduce the computational complexity of arithmetic coding. (2) Implementation and complexity analysis: An efficient implementation of a specific grammar-based data compression algorithm, the improved sequential algorithm based on the greedy grammar transform, is proposed. It is proven that using the proposed implementation, the improved sequential algorithm has linear computational and storage complexities. The methods and techniques developed in this study can be applied to other grammar-based data compression algorithms. (3) Applications: Several practical issues in applying the grammar-based coding theory to real-world data compression are addressed. These issues include how to design a grammar-based data compression algorithm for a memory-limited environment, how to deal with non-stationary data, and how to efficiently utilize a prior knowledge of a source to be compressed. (4) Extension to image compression: A specific grammar-based data compression algorithm, the Multilevel Pattern Matching (MPM) code, is generalized to compress images, resulting in the 2D MPM code. Context modelling is introduced to the 2D MPM code to further reduce the redundancy and improve the compression performance. Excellent compression performance is demonstrated in experiments for bi-level images.
Keywords/Search Tags:Compression
Related items