
Design And Implementation Of Fast Cross-species Spliced Alignment Algorithm On Genome-Scale

Posted on: 2007-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhang
Full Text: PDF
GTID: 2178360215970291
Subject: Computer Science and Technology
Abstract/Summary:
Bioinformatics is an emerging interdisciplinary field that applies modern computational techniques to the processing and study of biological data. Sequence alignment is an important research method in bioinformatics: it predicts the structure and function of unknown sequences by aligning them to known, similar sequences. Spliced alignment is a sequence alignment method that accounts for the different alternative splicing forms during alignment. With the rapid growth of biological sequence data and advances in research on the regulatory mechanisms of alternative splicing, it is essential to develop a spliced alignment algorithm with high alignment quality and high time and space efficiency.

This thesis first analyzes and compares the existing alignment tools, then focuses on the popular spliced alignment tool sim4. For our specific applications, we propose a fast cross-species spliced alignment algorithm for whole genomes, in which we build an index of the sequences and choose appropriate hit selection criteria. The goal is to increase processing speed without a significant loss of sensitivity and to make the tool suitable for cross-species alignment. We then parallelize the algorithm using a task partition strategy. Tests comparing the new algorithm with popular alignment tools (Blast, BLAT, sim4, etc.) show that the optimized sim4 has better execution performance.

As more of the human genome draft sequence is finished and the genomes of other organisms begin to be sequenced, research has entered the post-genomic era. Studying gene function with high-throughput biotechnologies is now a research hotspot. The optimized sim4 is suitable for aligning transcripts to genomes across species, and it could be used to support gene recognition and gene mapping at genome scale.
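The indexing and hit selection idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the k-mer length, and the `max_occurrences` filter are all assumptions chosen for illustration, standing in for whatever index structure and hit selection criteria the author actually used.

```python
from collections import defaultdict


def build_kmer_index(genome, k):
    """Map every k-mer in the genome to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def find_hits(index, transcript, k, max_occurrences=50):
    """Return (transcript_pos, genome_pos) seed hits.

    k-mers that occur more than max_occurrences times in the genome are
    skipped; this is a hypothetical hit selection criterion, used here
    only to show where such a filter would sit.
    """
    hits = []
    for qpos in range(len(transcript) - k + 1):
        positions = index.get(transcript[qpos:qpos + k], [])
        if 0 < len(positions) <= max_occurrences:
            hits.extend((qpos, gpos) for gpos in positions)
    return hits


# Toy data: two "exons" separated by an "intron" in the genome,
# and a transcript made of the two exons joined together.
genome = "ACGTACGGTT" + "GTAAGTTTTTCAG" + "CCATGGATCA"
transcript = "ACGTACGGTT" + "CCATGGATCA"

index = build_kmer_index(genome, k=8)
hits = find_hits(index, transcript, k=8)
```

In this toy case the hits fall on two different diagonals (genome position minus transcript position), one per exon, which is the signal a spliced aligner would extend and chain across the intron. Because each transcript can be looked up against the shared index independently, this kind of workload also lends itself to partitioning transcripts across workers.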
Keywords/Search Tags: bioinformatics, sequence alignment, spliced alignment, task partition, parallelization