
Finding patterns in DNA sequences through visualization with symbolic scatter plots

Posted on: 2011-10-19
Degree: Ph.D
Type: Thesis
University: North Carolina State University
Candidate: Cox, David N
Full Text: PDF
GTID: 2468390011971367
Subject: Biology
Abstract/Summary:
Visualization is frequently mentioned as a technique for analyzing large amounts of data. It has been widely anticipated for many years that visualization would become a major tool for the analysis of rapidly growing genomic databases. However, beyond the dot plot, which was introduced in 1981, there have been few successful attempts at visualizing this data.

In this thesis, a new technique for visualizing DNA sequences, the symbolic scatter plot, is introduced. First, it is shown how the symbolic scatter plot addresses the problems of (1) finding complex patterns in DNA sequences and (2) comparing sequences. Second, the symbolic scatter plot is analyzed in terms of human visual perception, particularly in terms of Gestalt theory and pre-attentive visual processing. Third, examples are presented of how specific pre-attentive visual cues can be manipulated or added to find motifs and visualize information content (i.e., entropy). Fourth, the practicality of symbolic scatter plots is demonstrated by using them to visualize and compare the human and chimpanzee genes responsible for Huntington's disease.
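The abstract equates information content with entropy but does not say how the thesis computes or displays it. As a rough illustration only, the sketch below applies the standard Shannon entropy definition to sliding windows of a DNA string. The function names, the window width, and the sample sequence are assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy, in bits per symbol, of the characters in `window`."""
    counts = Counter(window)
    total = len(window)
    # Sum of -p * log2(p) over each distinct symbol's relative frequency p.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(sequence: str, width: int) -> list[float]:
    """Entropy of every length-`width` window, sliding one base at a time."""
    return [
        shannon_entropy(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    ]


if __name__ == "__main__":
    # Hypothetical sequence: a low-complexity run followed by mixed bases.
    example = "AAAAAAAACGTACGTTGCA"
    for start, bits in enumerate(windowed_entropy(example, width=8)):
        print(start, round(bits, 3))
```

A window containing a single repeated base scores 0 bits, and a window with all four bases in equal proportion scores 2 bits, so low values flag repetitive regions. Such per-window values could then be mapped to a visual cue such as color, although whether the thesis does this is not stated here.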
Keywords/Search Tags:DNA sequences, Scatter plot, Visual, Symbolic scatter