
Research On DNA Storage Method Based On LZW Compression Algorithm

Posted on: 2024-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: H S Xu
Full Text: PDF
GTID: 2530307067973189
Subject: Computer technology
Abstract/Summary:
With the exponential growth of global data volume, traditional storage media have reached their information density limit, and finding a new generation of storage technology has become particularly important. Deoxyribonucleic acid (DNA) molecules, as natural information carriers, have many advantages over traditional storage media, such as high storage density, low maintenance cost, and long lifespan, making them an ideal storage medium. DNA storage has therefore attracted increasing attention from scientists. At present, DNA data storage is still in its infancy and faces several problems: low storage density, unbalanced GC content, excessive homopolymers, and high error rates. It is therefore of great significance to study how to improve the density of DNA data storage in order to reduce cost and improve efficiency. Designing algorithms that satisfy biological constraints, and so reduce the error rate during the synthesis, replication, and sequencing of DNA sequences, is an equally important research direction. This study proposes an adaptive-coding DNA storage system that generates its code stream according to the stored content. The main contributions are as follows:
(1) An adaptive-coding DNA storage scheme is proposed. It builds a coding dictionary for each piece of stored content and generates the code stream adaptively according to the characteristics of the file to be stored. The storage density of this model is higher than that of current mainstream encoding schemes, and it is suitable for Chinese and English files in different formats.
(2) To ensure robustness against error propagation during storage, an iterative screening scheme combining sequence randomization and ternary rotating coding is adopted. After several base-mapping iterations, DNA base sequences that satisfy the GC-content balance and homopolymer constraints (runs of at most 3 identical bases) are selected. The sequences obtained in this study have a GC content of 48.8%-52.5% and homopolymer runs of at most 3 bases.
(3) At the DNA storage application layer, a DNA storage software system is developed. Using the DNA storage implementation framework, the DNA molecular storage model, and a whole-process simulation experiment, stored files are encoded into DNA sequences. A side-by-side comparison with the actual storage density of current mainstream DNA storage schemes demonstrates the effectiveness of the proposed scheme.
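The abstract does not give implementation details, so the following is only a minimal sketch of the three building blocks it names, under stated assumptions. `lzw_compress` is textbook LZW (the dictionary grows from the input, which is one way to read "adaptive coding dictionary"), `trits_to_dna` is a commonly described rotating ternary code in which each trit selects one of the three bases that differ from the previous base, and `satisfies_constraints` checks GC content and homopolymer length. All function names, the starting base, and the GC window of 0.45-0.55 are illustrative assumptions, not the thesis's actual method or thresholds. Only the run limit of 3 comes from the abstract.

```python
BASES = "ACGT"


def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW: emit dictionary codes while the dictionary grows with the input."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate            # keep extending the known phrase
        else:
            codes.append(table[current])   # emit code for the longest known phrase
            table[candidate] = len(table)  # learn the new phrase
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def trits_to_dna(trits, previous="A"):
    """Rotating ternary code: each trit (0, 1, 2) picks a base different from the last one.

    The starting base "A" is an assumption made for this sketch.
    """
    out = []
    for trit in trits:
        choices = [b for b in BASES if b != previous]  # three candidates
        previous = choices[trit]
        out.append(previous)
    return "".join(out)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_homopolymer(seq: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    longest = run = 0
    previous = None
    for base in seq:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def satisfies_constraints(seq, gc_low=0.45, gc_high=0.55, max_run=3):
    """Screening predicate; the GC window is an assumed example, max_run=3 is from the abstract."""
    return gc_low <= gc_content(seq) <= gc_high and max_homopolymer(seq) <= max_run


if __name__ == "__main__":
    print(lzw_compress(b"ABABABA"))
    print(trits_to_dna([0, 0, 0]))
    print(satisfies_constraints("ACGTACGT"))
```

A screening loop in the spirit of item (2) could re-randomize the payload and re-encode until `satisfies_constraints` returns true. Note that this particular rotating code never repeats a base, so it is stricter than the run limit of 3 reported in the abstract.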
Keywords/Search Tags:DNA data storage, DNA synthesis sequencing, LZW coding, Storage Density, Biochemical constraints