Font Size: a A A

Research On The Hardware Acceleration For Biological Sequence Analysis

Posted on:2012-09-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:F XiaFull Text:PDF
GTID:1110330341451658Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Biological sequence analysis has become a fundamental task of modern life science and molecular biology. High performance computer systems based on multi-core processors or GPU accelerator are widely used to accelerate sequence analysis algorithms. However, parallel efficiency is greatly limited by bit-wised parallelism, complicated data dependency and tight synchronization resulting from irregular computing and storage features. Thus, efficiently executing the task of bio-sequence analysis on a general-purpose parallel computer becomes very awkward. Recently, the use of FPGA coprocessors has become a promising approach for accelerating bioinformatics applications. In this paper, we propose fine-grained parallelized algorithms and structures for accelerating classical bio-sequence analysis algorithms and models. We also propose a series of strategies to reduce storage requirements and implement massive parallelism and customized computing for bioinformatics applications on a prototype system constructed by general-purpose CPU combination with reconfigurable FPGA chip.Two-dimensional Dynamic Programming ParallelizationTo address the resource-constrained problem resulting from back-tracking operation in large-scale sequence alignment, we propose a fine-grained parallel algorithm. We design an isomorphism linear array template for accelerating 2D dynamic programming lattice filling stage under the conditions of non-backtracking, on-chip-backtracking and off-chip- backtracking.Three-dimensional Dynamic Programming ParallelizationWe address the complicated data dependency and tight synchronization resulting from irregular computing and storage features in RNA secondary structure prediction and propose a fine-grained parallel algorithm template. We design a structure composed of master-slave multiple processing elements for computing 3D dynamic programming lattice in parallel. We partition tasks by columns and assign tasks to processing elements for load balance. We exploit storage reduction and data reuse schemes to reduce memory access overhead.Four-dimensional Dynamic Programming ParallelizationTo address the complicated data dependency and limited storage bandwidth problem, we propose a"time-space domain overlapping"method to analyze the data dependency in RNA secondary structure prediction including pseudoknots—establish a mapping relation between data source and destination, and reduce the randomness of external memory access using dynamic scheduling for data access request. We propose a heterogeneous architecture composed of multiple processing elements for fine-grained hardware implementation on filling 4D dynamic programming matrices. Heuristic-based Database Searching ParallelizationDifferent from the index-based searching approaches, we propose a linear array composed of thousands of processing elements for multi-seeds detection and parallel extension without using looking up tables. We implement non-blocking database searching process by decomposing the detection array, merging successive seeds and multi-channel parallel extension strategies. Our implementation achieves superior performance results in both of processing element number and clock frequency over related works in the area of FPGA BLAST accelerators.HMM–based Stochastic Process Searching ParallelizationBecause of the tightly coupled data dependency, the elements filling process in the computing kernel of HMM–based searching algorithm can't be parallelized. We propose a mixed parallel computing strategy—fine-grained parallelism for single elements computing and coarse-grained parallelism for"model-sequence"matching—to accelerate the database stochastic process searching. Experimental results show that the average performance of single processing element is about 1.3 times as fast as the best implementation of related works.Bayesian Network-based Protein Structure Prediction ParallelizationWe propose a complete fine-grained parallel hardware implementation on FPGA to parallelize the Bayesian statistics and network model for protein structure prediction. We also propose multi-stage mixed pipelining and replication strategy for load balance. We adopt parameter table partition and replication strategies to eliminate the access competition of shared parameter tables. The whole computation structure is carefully pipelined to overlap the sequence loading, computing and back-writing operations as much as possible.Dynamic Reconfigurable Prototype System ConstructionWe construct a reconfigurable heterogeneous prototype system based on high-end FPGA chips and general-purpose processors for biological sequence analysis applications. We develop a sequence analysis toolkit and FPGA configuration file library and implement fast switching over different applications using global dynamic reconfigurable technology. Experimental results show that our prototype system exhibits significant speedup performance and power advantage compared to general-purpose schemes.
Keywords/Search Tags:Biological Sequence Analysis, Reconfigurable Computing, FPGA, Fine-grained Parallelism, Hardware Algorithm Accelerator, Global Dynamic Reconfiguration
PDF Full Text Request
Related items