
PXAlign: A parallel implementation of the XAlign application

Posted on: 2014-03-07
Degree: M.S.
Type: Thesis
University: Purdue University
Candidate: Magikar, Aditi
Full Text: PDF
GTID: 2458390008458685
Subject: Computer Science
Abstract/Summary:
Proteomics involves the assessment of large numbers of protein molecules, and mass spectrometry is one of the tools used for that assessment. The Proteome Discovery Pipeline (PDP) at Purdue carries out data processing and protein discovery using mass spectrometry-based proteomics. The pipeline is divided into stages, each performing a different computational task, and each stage currently executes serially. The XAlign stage processes the data and aligns protein peaks across different samples. Because it handles vast amounts of data and runs serially, the processors cannot process the data fast enough, which makes this stage a potential bottleneck for the pipeline. This thesis introduces parallelism into the XAlign application code to investigate whether it reduces the time needed to process the data. The parallel version is implemented with two commonly used parallelization techniques, MPI and OpenMP. Parallelizing XAlign could reduce the data bottleneck and speed up both the XAlign stage and the overall PDP. This is significant because faster processing would allow more samples to move through the pipeline in a given time frame.
Keywords/Search Tags: XAlign, Pipeline, Processing