
Optimization And Transplantation Of Multi-reference Gene Short Sequence Alignment Tool MUGI

Posted on: 2019-11-21 | Degree: Master | Type: Thesis
Country: China | Candidate: C Guo | Full Text: PDF
GTID: 2370330566461595 | Subject: Computer Science and Technology
Abstract/Summary:
DNA carries an organism's genetic information and determines its continuation and evolution. With the rapid development of next-generation sequencing tools, genetic data has become ever easier to obtain, and genomics has entered the era of big data. Aligning short reads against large-scale collections of reference sequences is a recently emerged problem, and a number of tools already address it. The key to a good tool of this kind is a concise index design together with an alignment algorithm matched to the index structure. Based on these two criteria, we selected MUGI, a tool that currently performs very well, as the subject of our research.

This thesis studies short read alignment against large-scale reference sequences from the standpoint of software optimization. We first introduce the background and current status of bioinformatics tools and analyze why MUGI needs to be optimized and ported. We then identify the optimization opportunities in MUGI and propose optimized solutions. The main results and contributions of this thesis are as follows.

First, to address the problem that MUGI's alignment algorithms are slow and insufficiently targeted, we designed a new exact alignment algorithm and a new inexact alignment algorithm, both more heuristic than the originals. The new exact algorithm greatly improves alignment speed at the cost of a small increase in index size, while the new inexact algorithm streamlines the workflow of the original inexact algorithm and improves performance without changing the index. Second, because MUGI's alignment algorithm is single-threaded and cannot exploit the multiple cores of a server, we designed a thread pool for it that takes full advantage of the server's multi-core architecture. Third, because MUGI cannot run directly on the Loongson platform, we ported it; this is the first time MUGI has been fully ported. During the port we also applied SIMD optimization using a combination of Loongson's vector components and the Loongson Multimedia Extension instructions, so that performance improved as part of the porting work. Finally, we constructed a framework for modifying the index, into which different modification algorithms can be added, and we designed a modification algorithm based on the relationship between variant density and index size, which reduces the size of the index.
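As an illustration of the thread-pool idea mentioned in the abstract, the following minimal sketch distributes independent reads across a pool of workers. It is not MUGI's code: the function names `align_read` and `align_all`, the naive substring search, and the in-memory list of references are all assumptions made for the example, and MUGI's compressed index is not modeled. Python threads show only the structure of the approach; a CPU-bound implementation would rely on native threads to occupy multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor


def align_read(read, references):
    """Hypothetical exact matcher: list (reference_index, offset) for each
    exact occurrence of `read`. A naive scan stands in for a real index."""
    hits = []
    for ref_id, ref in enumerate(references):
        pos = ref.find(read)
        while pos != -1:
            hits.append((ref_id, pos))
            pos = ref.find(read, pos + 1)  # continue past the last hit
    return hits


def align_all(reads, references, workers=4):
    """Align every read using a fixed-size pool of worker threads.

    Reads are independent of each other, so each one is a separate task.
    pool.map returns results in the same order as the input reads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: align_read(r, references), reads))
```

Because each read is aligned independently against a read-only set of references, the workers need no locking in this sketch; only the collection of results is ordered.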
Keywords/Search Tags: Short Read Alignment, Large-scale Reference Gene, Thread Pool, Loongson, Optimization, MUGI