Research On DNA Information Storage Method Based On Huffman Coding

Posted on:2020-01-23

Degree:Master

Type:Thesis

Country:China

Candidate:X M Song

GTID:2428330626452351

Subject:Electronics and Communications Engineering

Abstract/Summary:

With the continuous development of the digital society, the growing volume of data has promoted the progress of civilization and made people's lives easier, but it has also created a huge storage burden. Over the past two decades, scientists around the world have been looking for data storage solutions with higher storage density and longer shelf life. As an excellent carrier of biological genetic information, deoxyribonucleic acid (DNA) has properties for information storage that far exceed those of disk storage. Researchers therefore have a strong interest in using DNA as a medium for high-density, low-power, long-term, and stable storage of digital information, and DNA information storage technology has broad prospects.

Using DNA for digital information storage means establishing a mapping between the stored content and the base sequence of a DNA molecule, then synthesizing the corresponding DNA molecule to complete the writing of the stored content. DNA sequencing technology recovers the stored content by reading the base sequence of the DNA molecule. There is still a gap between the storage density achieved in DNA storage research and the storage potential of DNA molecules, so it is important to study how to increase the density of DNA information storage in order to reduce costs and improve performance. Designing corresponding error correction algorithms to reduce the errors introduced during DNA sequence synthesis, replication, and sequencing is also an important research direction.

To improve information storage density, the main work of this thesis is as follows:

1) A DNA information storage coding scheme with entropy coding is proposed. The scheme establishes the mapping between stored content and DNA base sequence through hexadecimal Huffman DNA coding. It has a higher storage density than current domestic and international solutions, and it is suitable for file formats such as images, audio, and text.

2) A Hamming error correction code is introduced into the coding scheme to provide error detection and error correction, and the Hamming code is improved based on the characteristics of DNA information storage. By inserting error correction bases into the base sequence to correct sequence errors that occur during storage and reading, the coding scheme achieves error correction while introducing as little redundancy as possible, and reduces the error rate of information storage.

3) A visual interface program is designed that converts stored content to a data stream and a data stream to a base sequence, and back again, implementing the writing and reading processes of storage.

4) The whole process of the proposed DNA information storage scheme, including biological experiments, is carried out. Two text documents and one image are stored in biosynthetic DNA molecules, and the storage file size is 5.2 KB.8, totaling 3,614 bases. A backup base sequence is read by DNA sequencing technology, and the stored content is successfully decoded and restored.
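The abstract does not spell out how the hexadecimal Huffman DNA coding works, so the following is only a minimal sketch of one plausible reading. It assumes that a 16-ary Huffman code is built over the byte values of the file and that each hexadecimal code digit is written as one pair of bases (4 x 4 = 16 pairs). The names `build_code`, `encode`, `decode`, and the table `DIGIT_TO_PAIR` are hypothetical and are not taken from the thesis.

```python
import heapq
from collections import Counter
from itertools import count, product

BASES = "ACGT"
# Assumed mapping: hexadecimal digit 0..15 -> one pair of bases ("AA", "AC", ..., "TT").
DIGIT_TO_PAIR = ["".join(p) for p in product(BASES, repeat=2)]
PAIR_TO_DIGIT = {pair: d for d, pair in enumerate(DIGIT_TO_PAIR)}


def build_code(data, radix=16):
    """Return {byte value: tuple of base-`radix` digits} for a Huffman code."""
    tie = count()  # unique tie-breaker so heap entries never compare nodes
    heap = [(w, next(tie), sym) for sym, w in Counter(data).items()]
    # Pad with zero-weight dummy leaves so every merge takes exactly `radix` nodes.
    while len(heap) < 2 or (len(heap) - 1) % (radix - 1) != 0:
        heap.append((0, next(tie), None))
    heapq.heapify(heap)
    while len(heap) > 1:
        children = [heapq.heappop(heap) for _ in range(radix)]
        weight = sum(c[0] for c in children)
        heapq.heappush(heap, (weight, next(tie), [c[2] for c in children]))

    code = {}

    def walk(node, prefix):
        if isinstance(node, list):          # internal node: one digit per child
            for digit, child in enumerate(node):
                walk(child, prefix + (digit,))
        elif node is not None:              # real leaf (dummy leaves are None)
            code[node] = prefix

    walk(heap[0][2], ())
    return code


def encode(data, code):
    """Write each code digit of each byte as a pair of bases."""
    return "".join(DIGIT_TO_PAIR[d] for byte in data for d in code[byte])


def decode(dna, code):
    """Read base pairs back into digits and match them against the code table."""
    inverse = {digits: sym for sym, digits in code.items()}
    out, current = bytearray(), ()
    for i in range(0, len(dna), 2):
        current += (PAIR_TO_DIGIT[dna[i:i + 2]],)
        if current in inverse:              # prefix-free, so the first match is final
            out.append(inverse[current])
            current = ()
    return bytes(out)
```

For example, `code = build_code(data)` followed by `decode(encode(data, code), code)` returns the original bytes. A real scheme would also have to store or agree on the code table, which this sketch leaves out.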
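The improved Hamming code of the thesis is not described in the abstract. As an illustration of the general idea of inserting check bases, the sketch below uses the standard textbook Hamming(7,4) code, not the author's variant. It assumes each base carries two bits, and it protects the high-bit plane and the low-bit plane with separate codewords, so that one substituted base flips at most one bit in each codeword. The function names `encode_byte` and `decode_block` are hypothetical.

```python
BASES = "ACGT"  # assumed 2-bit mapping: A=00, C=01, G=10, T=11


def hamming74_encode(nibble):
    """Standard Hamming(7,4): return 7 bits laid out as p1 p2 d1 p3 d2 d3 d4."""
    d = [(nibble >> shift) & 1 for shift in (3, 2, 1, 0)]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]


def hamming74_decode(bits):
    """Correct up to one flipped bit, then return the 4 data bits as an int."""
    b = list(bits)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s3 = b[3] ^ b[4] ^ b[5] ^ b[6]
    position = s1 + 2 * s2 + 4 * s3     # 1-based index of the error, 0 if none
    if position:
        b[position - 1] ^= 1
    return (b[2] << 3) | (b[4] << 2) | (b[5] << 1) | b[6]


def encode_byte(byte):
    """Encode one byte as 7 bases: high nibble in the high-bit plane, low nibble in the low-bit plane."""
    hi = hamming74_encode(byte >> 4)
    lo = hamming74_encode(byte & 0xF)
    return "".join(BASES[2 * h + l] for h, l in zip(hi, lo))


def decode_block(block):
    """Decode 7 bases back to one byte, correcting a single base substitution."""
    values = [BASES.index(ch) for ch in block]
    hi = hamming74_decode([v >> 1 for v in values])
    lo = hamming74_decode([v & 1 for v in values])
    return (hi << 4) | lo
```

Under these assumptions, 7 bases carry 8 data bits, and `decode_block` recovers the byte after any single base substitution in the block. Insertions and deletions are not handled by this sketch.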

Keywords/Search Tags:

DNA information storage, Storage density, Huffman coding, Error correction coding, DNA synthesis sequencing

Related items

1	Research On Coding Technology For Efficient DNA Information Storage
2	Constraint And Design Of DNA Storage Coding Set Based On Improved MVO Algorithm
3	Research On Color Two Dimensional Coding Technology
4	Research And Implementation Of The Channel Error Correction Coding Of LTE System
5	DNA Storage With Error Correction Mechanism
6	Channel Error Correction Coding Theory And Its DSP Implementation
7	Research And Design Of File Storage Solution Based On Coded Blockchai
8	Research On Channel Modeling And Coding Techniques Under Base Insertions/Deletions In DNA Storage
9	Research Of Error Correction Coding Used In Oilfield Distribution Network Based On TWACS
10	Storage And Computing Of Matrix On Cryptanalysis