A computational method for genomic sequence analysis

Posted on: 2005-09-26
Degree: M.S.Eng
Type: Thesis
University: University of Massachusetts Lowell
Candidate: Kostem, Emrah
Full Text: PDF
GTID: 2453390008997935
Subject: Computer Science
Abstract/Summary:
Efficiently processing large data sets, such as the vast amount of genomic data now available, is a challenging task. Bioinformatics applications such as similarity searches in a genomic database must typically handle an abundance of data in the form of nucleotides or amino acids. There is therefore both a need and a desire to make these applications more effective by decreasing the complexity of the sequence comparison algorithms and tools used in computational biology. The amount of time needed to analyze genomic sequences plays a key role in the accuracy and biological meaning of the results obtained.

The pioneering work in [1], [2] demonstrated that, through a mapping process, it is possible to (1) recognize shapes and (2) concisely represent the shape of large data using a set of coefficients derived in the mapping process. In [35], [36], [37], [38], [39] this method was applied to object, fingerprint, facial, and large data representation and recognition. In this thesis we illustrate how these previous results can be applied to process and analyze genomic data. In particular, in the approach outlined herein, a syntactic representation is formed for the genomic sequences whose representation we wish to extract and reproduce compactly. We present the problems and solutions of sequence similarity comparison in this reduced space. Finally, we show the effect of nucleotide location on the similarity results and the effect of additional gap information on the phylogeny of the sequences. This research will find applications in the areas of sequence analysis, classification, phylogenetics, and drug discovery.
Keywords/Search Tags: Genomic, Sequence, Large data, Process