
A heterogeneous computing platform for biological sequence database searches

Posted on: 2008-06-02
Degree: Ph.D.
Type: Dissertation
University: Wayne State University
Candidate: Meng, Xiandong
Full Text: PDF
GTID: 1448390005964932
Subject: Computer Science
Abstract/Summary:
Because dedicated parallel computing and supercomputing machines are expensive, High Performance Computing (HPC) in today's enterprise computing infrastructure has emerged as a heterogeneous computing architecture that allows new commercial off-the-shelf components or innovative implementations to be integrated through the extension of services. To enable low-cost HPC, we have developed a hybrid computing platform, the Wayne Bio-Accelerator (WaBA), for high-throughput biological sequence analysis that utilizes the existing enterprise computing infrastructure as well as various general-purpose computer architectures over the network.

WaBA is a heterogeneous computing platform that integrates legacy processors, conventional processors with SSE2 instructions, and reconfigurable coprocessors into one system, and allows each to perform the task to which it is best suited. Accurate biological sequence database search algorithms such as the Smith-Waterman algorithm are the most sensitive, but their high computational complexity limits their use. WaBA accelerates the sensitive and time-consuming Smith-Waterman database search core with dynamic load balancing, data pre-fetching, database and query segmentation, and a series of optimizations.

The WaBA scheduling strategy automatically distributes the workload across multiple heterogeneous processors based on their processing capabilities. An efficient adaptive data pre-fetching scheme was designed for slow I/O interfaces, such as those of PCI-based reconfigurable computing systems, to overlap communication and computation time. The implementation eliminated a major portion of the data access penalty and improved performance by up to 42%.
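As a rough illustration of distributing work according to processing capability, the following Python sketch splits a number of database sequences across workers in proportion to a relative speed figure. It is a static simplification and not the WaBA scheduler itself; the function name `split_by_capability`, the worker names, and the speed values are all hypothetical.

```python
def split_by_capability(num_sequences, speeds):
    """Split work items across workers in proportion to their relative speeds.

    `speeds` maps a (hypothetical) worker name to a positive relative
    throughput. Fractional shares are rounded with the largest-remainder
    method so that the shares always sum to `num_sequences`.
    """
    total = sum(speeds.values())
    # Ideal (possibly fractional) share for each worker.
    raw = {name: num_sequences * s / total for name, s in speeds.items()}
    # Start from the integer part of each share.
    shares = {name: int(r) for name, r in raw.items()}
    leftover = num_sequences - sum(shares.values())
    # Hand the remaining items to the workers with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda n: raw[n] - shares[n], reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    # Assumed relative speeds, for illustration only.
    print(split_by_capability(1000, {"legacy": 1, "sse2": 4, "fpga": 5}))
```

A dynamic scheme would instead revise these shares at run time from measured throughput; the sketch only shows the proportional split.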
We also developed a set of WaBA API functions that hide the complexity of hardware programming and data format conversion, so that the WaBA accelerator connects seamlessly to existing biological sequence search tools. Furthermore, the parallel SSE2 implementation obtained a speedup of 143 on a cluster of 16 dual-CPU Intel processors compared with the sequential version that was widely used at the time. Additionally, the WaBA heterogeneous computing system demonstrated a speedup of 110 using only one reconfigurable coprocessor and 8 dual-core AMD processors. Clearly, the integrated heterogeneous computing architecture can support data- and compute-intensive life science applications at low cost.
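For readers unfamiliar with the Smith-Waterman search core mentioned in the abstract, the following is a minimal sequential sketch of the local alignment score computation. It assumes a simple match/mismatch score and a linear gap penalty, with parameter values chosen only for illustration. Practical protein searches typically use a substitution matrix and affine gap penalties, and the SSE2 and reconfigurable-hardware implementations described in the dissertation are not reproduced here.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between two sequences.

    Uses the Smith-Waterman recurrence with a linear gap penalty and keeps
    only two rows of the dynamic programming matrix in memory.
    The scoring parameters are illustrative assumptions.
    """
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # first column is always zero
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)  # local alignment never drops below zero
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

The work is proportional to the product of the two sequence lengths for every database entry, which is the computational cost the abstract refers to.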
Keywords/Search Tags: Computing, Biological sequence, Cost, Processors, Search, Platform, WaBA