| With the rapid development of sequencing technology, biological sequence data, especially DNA sequences, grows enormously every year, requiring more storage space and additional communication cost. Storing this steadily growing volume of DNA sequence data in limited storage space is a challenge, so improving the compression of DNA sequence data has become a matter of concern. After reviewing traditional compression algorithms and existing DNA sequence compression algorithms, this thesis explores DNA sequence compression using both transform coding and entropy coding, and achieves desirable results. The main contents of the thesis are: (1) After an investigation of the redundancy of DNA sequences, traditional compression algorithms and DNA sequence compression algorithms are analyzed and the advantages and disadvantages of each are evaluated, which establishes a theoretical basis for the subsequent algorithm research. (2) Drawing on the principle of the Haar wavelet transform, a combined algorithm in which the Haar wavelet transform is followed by arithmetic coding is applied to DNA sequence data and discussed. (3) Based on an analysis of the Burrows-Wheeler transform (BWT), which is used to transform the DNA data, the algorithm is improved by exploiting the characteristics of DNA sequences and the idea of move-to-front (MTF) coding, which shows promising results for DNA sequence compression. (4) The performance of entropy coding methods for DNA sequence compression, mainly Huffman coding and arithmetic coding, is studied. 
The performance of both algorithms in combination with BWT is also evaluated. (5) A practical DNA sequence compression utility based on BWT is presented and compared with DNACompress on the test sequences. (6) Summarizing the work on transform coding and entropy coding for DNA sequence compression, future work is outlined, including multi-sequence comparison and improving entropy coding to handle the N characters in DNA sequences. |
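
The transform stage described in item (3) can be illustrated with a minimal Python sketch. It shows only the textbook Burrows-Wheeler transform followed by plain move-to-front coding; it is not the improved algorithm developed in the thesis. The function names, the `$` sentinel, and the `$ACGT` alphabet are assumptions made for illustration, and the naive rotation sort is only suitable for short inputs.

```python
from typing import List


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last: str, sentinel: str = "$") -> str:
    """Naive inverse: repeatedly prepend the last column and re-sort."""
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    # The original string is the rotation that ends with the sentinel.
    row = next(r for r in table if r.endswith(sentinel))
    return row[:-1]


def mtf_encode(text: str, alphabet: str) -> List[int]:
    """Move-to-front: emit each symbol's current index, then move it to the front."""
    table = list(alphabet)
    codes = []
    for ch in text:
        idx = table.index(ch)
        codes.append(idx)
        table.insert(0, table.pop(idx))
    return codes


def mtf_decode(codes: List[int], alphabet: str) -> str:
    """Inverse of mtf_encode, using the same initial alphabet order."""
    table = list(alphabet)
    out = []
    for idx in codes:
        ch = table.pop(idx)
        out.append(ch)
        table.insert(0, ch)
    return "".join(out)


if __name__ == "__main__":
    dna = "ACGTACGTTTGA"            # hypothetical toy sequence
    transformed = bwt(dna)
    codes = mtf_encode(transformed, "$ACGT")
    print(transformed, codes)
```

BWT tends to group identical bases into runs, which MTF then maps to small integers. That skewed distribution is what an entropy coder such as Huffman or arithmetic coding, as studied in item (4), can exploit.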