
Storage and Compression of DNA Sequence Alignment Results

Posted on: 2013-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: C Chen
Full Text: PDF
GTID: 2208330434970421
Subject: Computer technology
Abstract/Summary:
With advances in automated DNA sequencing and the completion of a series of genome sequencing projects, more and more of the human genome and the genomes of other organisms have been sequenced. Sequence alignment is one of the core methods for processing DNA data: it can reveal structural, functional, and evolutionary relationships between DNA sequences. Sequencing projects generate vast amounts of DNA sequence data every day, so the volume of sequence data is growing exponentially and the volume of alignment output is growing explosively along with it. Rapid progress in storage devices has eased this growth to some degree, but as sequencing studies continue, adding hardware alone can no longer keep up with the rapidly growing volume of DNA data, and the cost of storing and using these data will eventually reach a scale we cannot afford.
Next-generation sequencing (NGS) platforms have considerably reduced the cost of DNA sequencing, making gene sequence analysis practical in clinical settings. From the standpoint of both storage and application, compressing sequence alignment output therefore plays an important role in storing, managing, and transmitting DNA data. Compression of DNA sequence data has attracted wide attention in academia, but few researchers have studied how to compress sequence alignment results in medical scenarios, and storing these results remains a major challenge.
In this thesis, we propose a novel approach to storing alignment output in clinical scenarios. We design storage structures and a compression strategy to reduce the large amount of storage space required.
Experimental data show that at high coverage our compression schemes perform slightly better than RAR and ZIP. We also implement a storage and compression system for DNA sequence alignment results that provides a graphical interface.
Keywords/Search Tags: DNA sequence alignment outputs, storage, compression