
Short Read Alignment and Assembly Algorithms for Next-Generation Sequencing

Posted on: 2012-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: M Lin
Full Text: PDF
GTID: 2298330335482217
Subject: Biological Information Science and Technology
Abstract/Summary:
Next-generation sequencing technologies produce huge numbers of short reads, 35 to 75 bp long. Traditional read alignment and assembly algorithms do not cope well with so many short reads, which poses a new challenge. This thesis discusses alignment and assembly algorithms for short reads. Its main work is as follows:

(1) It analyzes the hash-table-based algorithms ELAND, MAQ, and SOAP and the BWT-based algorithms BOWTIE, BWA, and SOAP2, explains their principles and execution steps, and compares their performance on simulated and real data. The experimental results show that the BWT-based algorithms generally outperform the hash-table-based ones in both memory and time. MAQ needs more run time than the BWT-based algorithms but less memory, and SOAP needs more memory than the others.

(2) It classifies assembly algorithms, analyzes SSAKE, VCAKE, and VELVET, and compares their performance on the lacto-genome. The experimental results show that VELVET performs best, followed by VCAKE, while SSAKE needs much more time.

(3) It proposes a new alignment algorithm based on a block index, which also builds on the BWT. It partitions the BWT into several blocks, compresses them separately, and allocates a buffer; when the buffer is full, it evicts the block that has gone unused the longest. The results show that the larger the buffer, the faster it runs, needing less run time than MAQ. With a small buffer it runs slowly but needs less memory than BOWTIE.

(4) Finally, it improves the SOAP algorithm by partitioning each short read into parts A, B, and C, which reduces memory use and speeds up execution. A comparison with SOAP shows that the improved algorithm is better in both run time and memory use.
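The block-index idea in item (3) can be illustrated with a small sketch. This is not the thesis's implementation, which the abstract does not specify: the class name `BlockedBWT`, the parameters `block_size` and `buffer_blocks`, the use of `zlib` as the per-block compressor, and the per-block count checkpoints are all assumptions made for illustration. The sketch stores the BWT as separately compressed blocks, keeps only a few decompressed blocks in a least-recently-used buffer, and counts exact matches with standard backward search.

```python
import zlib
from collections import OrderedDict


def build_bwt(text):
    """Return the BWT of text + '$' using naive suffix sorting (toy sizes only)."""
    text += "$"
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    return "".join(text[i - 1] for i in suffix_array)


class BlockedBWT:
    """Hypothetical block index: compressed BWT blocks plus an LRU buffer."""

    def __init__(self, bwt, block_size=4, buffer_blocks=2):
        self.n = len(bwt)
        self.block_size = block_size
        self.buffer_blocks = buffer_blocks
        self.blocks = []       # each block compressed on its own
        self.checkpoints = []  # character counts seen before each block
        counts = {}
        for start in range(0, self.n, block_size):
            self.checkpoints.append(dict(counts))
            chunk = bwt[start:start + block_size]
            self.blocks.append(zlib.compress(chunk.encode("ascii")))
            for ch in chunk:
                counts[ch] = counts.get(ch, 0) + 1
        # first[c] = number of characters in the text that sort before c
        self.first, total = {}, 0
        for ch in sorted(counts):
            self.first[ch] = total
            total += counts[ch]
        self.buffer = OrderedDict()  # block index -> decompressed block
        self.loads = 0               # number of decompressions performed

    def _block(self, idx):
        if idx in self.buffer:
            self.buffer.move_to_end(idx)          # mark as most recently used
        else:
            if len(self.buffer) >= self.buffer_blocks:
                self.buffer.popitem(last=False)   # evict the longest-unused block
            self.buffer[idx] = zlib.decompress(self.blocks[idx]).decode("ascii")
            self.loads += 1
        return self.buffer[idx]

    def occ(self, ch, pos):
        """Count occurrences of ch in bwt[:pos]."""
        if pos == 0:
            return 0
        idx, offset = divmod(pos - 1, self.block_size)
        return self.checkpoints[idx].get(ch, 0) + self._block(idx)[:offset + 1].count(ch)

    def count(self, pattern):
        """Number of exact occurrences of pattern, found by backward search."""
        lo, hi = 0, self.n
        for ch in reversed(pattern):
            if ch not in self.first:
                return 0
            lo = self.first[ch] + self.occ(ch, lo)
            hi = self.first[ch] + self.occ(ch, hi)
            if lo >= hi:
                return 0
        return hi - lo


if __name__ == "__main__":
    index = BlockedBWT(build_bwt("ACGTACGTTACG"), block_size=4, buffer_blocks=2)
    print(index.count("ACG"), index.loads)
```

In this toy setup, a smaller `buffer_blocks` lowers the amount of decompressed data held in memory but causes more decompressions (tracked in `loads`), which mirrors the time and memory trade-off the abstract reports for small and large buffers.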
Keywords/Search Tags: BWT, de Bruijn graph, short read alignment, short read assembly, compressed suffix array