Font Size: a A A

Study On Parallel Algorithms For Longest Common Subsequence On Heterogeneous Cluster Computing Systems

Posted on:2009-01-09Degree:MasterType:Thesis
Country:ChinaCandidate:L L XuFull Text:PDF
GTID:2178360245468233Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Solving the longest common subsequence (LCS) of any given strings is one of the basic and important research problems in computer science. It is a approximate string matching problem which is only allowed to do the editing operations of insertion and deletion. The longest common subsequence has been widely applying to many areas such as biological sequence similarity analysis, network intrusion detection, network remote instruction, E-commerce, information retrieval, data mining and automatic proposition. With the growth of string sequence, the rapid serial algorithm to solve LCS problem also appears inefficient. Due to high performance and low cost of the cluster computing systems, parallel processing for the longest common subsequence on the heterogeneous cluster computing systems is very meaningful in practice.For the LCS problem with multiple sequences, based on the optimality principle of divisible load theory and the fixed sequence of objective strings distribution, an optimal objective strings distribution strategy is presented on the heterogeneous cluster computing systems that processors have different computing speeds and communication capabilities. With this strategy every slave processor starts the implementation of serial LCS algorithm and finishes the computing at the same time with the fixed sequence of objective strings distribution, so that the required parallel processing time for parallel LCS Algorithm is to obtain minimum value. The algorithms analysis and experimental results on the cluster of personal computers show that the required parallel processing time for parallel LCS Algorithm applying the optimal objective strings distribution strategy decreases 6 ~ 32% compared to dividing the objective strings equally.For the pair-wise sequence LCS problem, and the assumption that the distribution of processors in a fixed order, an optimal sequence distribution strategy is first presented by dividing into the dynamic programming matrix of pair-wise sequence and its corresponding closed-form expressions are given on the heterogeneous cluster computing systems that processors have different computing speeds and communication capabilities. By coordinating the communications among each slave processors, the time of parallel pair-wise sequence LCS algorithm can be minimized. Algorithm analysis and experimental results show that the required parallel processing time for parallel pair-wise sequence LCS problem reduces significantly and obtain a good acceleration and scalability using the optimal sequence distribution strategy compared to.the average allocation strategy.
Keywords/Search Tags:longest common subsequence, parallel algorithm, heterogeneous cluster systems, divisible load, distribution strategy
PDF Full Text Request
Related items