Font Size: a A A

Parallel external memory suffix array construction

Posted on:2010-02-19Degree:M.SType:Thesis
University:University of Maryland, Baltimore CountyCandidate:Walia, NancyFull Text:PDF
GTID:2448390002980586Subject:Computer Science
Abstract/Summary:
A suffix array is a simple and powerful data structure that has numerous applications such as string matching and text compression. Many algorithms have been written to efficiently construct suffix arrays. One of the linear time algorithms, DC-3 has already been converted into an external memory as well as into a scalable internal memory parallel algorithm. However, there are situations in which there is a need for computing in external memory, while scalable parallelism is needed at the same time in order to speed up the computation. We have developed the first parallel external memory DC-3 algorithm. We have implemented our parallel external memory DC-3 algorithm using C and MPI. Measurements were performed on the Bluegrit cluster using IBM JS-20s and PVFS file system. The experiments show a good scaling factor and result in the conclusion that it is likely that our algorithm can be used to construct extremely large suffix arrays much faster.
Keywords/Search Tags:Suffix, External memory, Algorithm
Related items