Evaluation And Comparison Of Multiple Aligners For Next-generation Sequencing Data

Posted on: 2014-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: J Shang
Full Text: PDF
GTID: 2248330398962973
Subject: Systems Biology
Abstract/Summary:
Next-generation sequencing technology has advanced rapidly and now generates massive volumes of data. Aligning and mapping these vast quantities of reads underlies a variety of genome-wide studies based on next-generation sequencing, such as RNA-Seq, ChIP-Seq and resequencing, and is an essential step in interpreting sequencing data across biological applications. Biologists often select aligners arbitrarily, without considering whether their features, computational performance and accuracy suit the task. In this study, we systematically evaluate and compare the capabilities of multiple aligners for next-generation sequencing data, so that the resulting overview can help biologists choose aligners suited to their specific application. We used real-life and in silico data to compare and evaluate the aligners against three criteria: application-specific alignment features, computational performance and alignment accuracy.

Based on the evaluation results, Novoalign and Segemehl offer multiple alignment features and can be used for a broad range of applications, such as gapped alignment for single nucleotide polymorphism (SNP) discovery and structural variation, paired-end alignment for mapping repetitive regions, and bisulfite alignment for ChIP sequencing data analysis. SOAP2, RMAP, PASS, Novoalign and PerM are appropriate for aligning and mapping short reads that contain errors, while PASS, SOAP2 and Novoalign are suited to aligning and mapping reads that contain indels. GASSST is a candidate for long-read aligning and mapping. This study serves as a guiding resource for biologists seeking further insight into selecting suitable aligners for a specific application.
Keywords/Search Tags: Alignment algorithms, software evaluation, next-generation sequencing data, reads