With the development of information technology, information propagates faster, and the total amount of information generated is increasing exponentially. It is estimated that within the next two decades, traditional data storage media (e.g., hard disk drives, tapes, disks, and flash memory) will be unable to handle this enormous amount of information. In other words, there will not be enough media to store the data, a situation referred to as the “data crisis”. Thus, a novel data storage medium with high storage capacity and long retention time is urgently needed. Deoxyribonucleic acid (DNA) is a naturally selected molecular information medium that stores the essential blueprints of living organisms, and it features extremely high storage density and a long lifespan. Given these advantages, DNA-based molecular data storage has drawn great attention and is regarded as a potential solution to the data crisis.

The mainstream DNA data storage process includes encoding, DNA synthesis, preservation of the DNA library, DNA sequencing, and decoding. However, many problems in the current process still need to be addressed. First, limited by conventional DNA synthesis techniques, de novo DNA synthesis suffers from high cost, low efficiency, a high error rate, and long synthesis times. The application of DNA data storage systems is therefore currently hindered by high synthesis costs and low synthesis efficiency. Second, the DNA synthesis and sequencing steps typically involve complicated manual operations and sophisticated apparatus, which have not yet been optimized for DNA data storage. Therefore, developing a simple, efficient, highly integrated, highly scalable, and flexible DNA data storage system is necessary to overcome these barriers and enable wide application.

To achieve this goal, this paper combines electrochemistry, microfluidics, and molecular biology, investigates the electrochemical and biological synthesis of DNA, and optimizes parameters including 
synthesis efficiency, accuracy, and flexibility based on various characterization methods, aiming to address the problems of high synthesis costs and long synthesis times. Then, by combining an in situ DNA sequencing technique, microfabrication, and the microfluidic Slipchip, an integrated DNA data storage system is established. This system can perform DNA synthesis and sequencing in tandem to simplify the data writing and reading steps, aiming to address the separation of processes and the low compatibility between procedures in current systems.

The main research contents are as follows:

1. The feasibility of synthesizing DNA on an electrode by electrochemical methods was verified. Based on phosphoramidite nucleotide chemistry and electrochemical deprotection, DNA strands were synthesized on an electrode modified with gold nanoparticles. The accuracy and surface density (4.1×10¹³ molecules per cm²) of the DNA molecules synthesized on the electrode were characterized using fluorescence in situ hybridization and chronocoulometry, and these results were used to optimize the parameters of the synthesis step. The synthesis yield and surface density of DNA after optimization were estimated from chronocoulometric measurements. In this way, highly efficient DNA synthesis (97.1% step yield) was achieved on the electrode.

2. A biological DNA synthesis method was investigated. An editable and reusable DNA data storage structure inspired by movable type printing was designed based on the primer exchange reaction. Information was first converted into corresponding DNA hairpin sequences. The DNA polymerase then appended the information to the primer, using the hairpins as templates. By premixing different combinations of hairpins, the corresponding data could be added to the primer. This method exhibited high orthogonality, high efficiency, high storage density, editability, and reusability. The storage and assembly of words, images, texts, and random numbers were demonstrated. The 
proportion of de novo DNA synthesis was reduced, leading to a decrease in synthesis cost (~880 US$ MB⁻¹) and time. The method also exhibited a large maximal synthesis length (680 nt) and a low error rate (~1%). In addition, DNA computing was used to perform a conditional search for a single entry in the DNA data library, which demonstrated the feasibility of using DNA computing to build a molecular search module for DNA data storage.

3. A scalable, integrated DNA-based data storage system was constructed based on integrated DNA synthesis and sequencing and the microfluidic Slipchip technique. First, DNA sequencing on the electrode was realized based on the principles of charge perturbation and sequencing by synthesis. The transient signals in the DNA polymerase-catalyzed primer extension reaction were measured and filtered using statistical methods, and the original DNA sequence was then inferred with an average accuracy of 84.1%. Repeated sequencing of the DNA strands relied on DNA regeneration via chemical denaturation, which further improved the sequencing accuracy to 93.3% through plurality voting. Afterward, DNA strands with different sequences were synthesized and then sequenced on a 2×2 electrode array. Reagent transport, switching, and washing during the DNA synthesis and sequencing steps were implemented by slipping the upper reagent reservoir block. By coupling the data writing and reading steps, the DNA data storage process was notably simplified and the need for auxiliary instruments was reduced. Finally, a total of 47 bytes of data were stored using this system. This work proposes a prototype of an integrated DNA-based data storage system, providing a route toward practical systems featuring high scalability, throughput, and automation.
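
The plurality-voting step mentioned in item 3 can be illustrated with a minimal sketch. The abstract does not describe how the voting is implemented, so the code below is only an assumption: it takes repeated reads of the same strand that are already aligned and of equal length, and picks the most frequent base at each position. The function names `plurality_consensus` and `accuracy` and the example reads are hypothetical and are not taken from the work itself.

```python
from collections import Counter


def plurality_consensus(reads):
    """Return the most frequent base at each position across repeated reads.

    Assumption (not from the source): reads are pre-aligned and equal in length.
    """
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")

    consensus = []
    for column in zip(*reads):
        # most_common(1) returns the base with the highest count;
        # ties fall back to the base seen first in the column.
        base, _ = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


def accuracy(called, reference):
    """Fraction of positions where the called base matches the reference."""
    if len(called) != len(reference):
        raise ValueError("sequences must have the same length")
    matches = sum(c == r for c, r in zip(called, reference))
    return matches / len(reference)


if __name__ == "__main__":
    # Hypothetical reads of one strand, each with at most one miscalled base.
    reference = "ACGTAC"
    reads = ["ACGTAC", "ACCTAC", "AGGTAC"]
    consensus = plurality_consensus(reads)
    print(consensus, accuracy(consensus, reference))
```

In this toy example two of the three reads contain a single error, yet the errors fall at different positions, so the per-position vote recovers the reference. This is the general reason repeated sequencing can raise accuracy above that of any single read.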