DNA Pattern Analysis Algorithms and Software

Posted on: 2007-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Lin
Full Text: PDF
GTID: 2190360182490530
Subject: Optical Engineering
Abstract/Summary:
One of the important goals of the Human Genome Project is the development of technologies for automated, high-speed DNA sequencing. Four-dye fluorescence-based capillary electrophoresis is the most widely used approach to automated DNA sequence analysis. The signal-to-noise ratio and resolution of a DNA chromatogram are degraded by environmental noise, the substantially overlapping emission spectra of the four dyes, electrophoretic zone broadening, dye mobility shift, and similar effects. The raw DNA chromatogram should therefore be processed to facilitate base-calling. This thesis focuses on algorithms for DNA chromatogram processing and on the design of the corresponding software. The main functions of the software are preprocessing, cross-talk filtering, and postprocessing of the DNA chromatogram.

Firstly, the software preprocesses the DNA chromatogram. Preprocessing consists of data selection, baseline adjustment, noise filtering, and peak identification. Data selection removes the pre-primer data section, which contains little useful information. Baseline adjustment brings the baselines of the four channels to the same level. Noise filtering removes impulse noise and white noise. Peak identification selects the peak data used to determine the cross matrix.

Secondly, the software performs cross-talk filtering of the DNA chromatogram. Cross-talk filtering is an important step in DNA chromatogram processing: it deduces the concentrations of the four fluorophores from the fluorescence emission intensities measured at four different wavelengths. It is necessary because the four dyes used to label DNA fragments have substantially overlapping emission spectra. The main task of cross-talk filtering is to determine the cross matrix, which is typically done by peak analysis, slope analysis, or four-dimensional cluster analysis.
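To make the role of the cross matrix concrete, the following is a minimal Python sketch of the linear model that cross-talk filtering is commonly assumed to follow: at each time point, the four measured intensities are the product of the cross matrix and the four dye concentrations, so the concentrations can be recovered by solving a 4x4 linear system. The matrix values, the names `solve_linear` and `remove_cross_talk`, and the sample data are illustrative assumptions; they are not taken from the thesis or from its software, and the sketch does not cover how the matrix itself is determined.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve_linear(a: Matrix, b: Vector) -> Vector:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("cross matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def remove_cross_talk(cross_matrix: Matrix, samples: List[Vector]) -> List[Vector]:
    """Estimate the four dye concentrations for each four-channel sample."""
    return [solve_linear(cross_matrix, sample) for sample in samples]


# Made-up cross matrix: column j holds the response of the four detection
# channels to a unit amount of dye j (assumed values, for illustration only).
CROSS_MATRIX = [
    [1.0, 0.3, 0.0, 0.0],
    [0.2, 1.0, 0.3, 0.0],
    [0.0, 0.2, 1.0, 0.3],
    [0.0, 0.0, 0.2, 1.0],
]

# One raw sample produced by two units of the second dye alone.
raw_samples = [[0.6, 2.0, 0.4, 0.0]]
concentrations = remove_cross_talk(CROSS_MATRIX, raw_samples)
```

Solving the system per sample avoids forming an explicit inverse; for a long trace, inverting the matrix once and multiplying each sample by it would be an equivalent alternative.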
The software implements slope analysis and four-dimensional cluster analysis to determine the matrix, and applies the matrix to the raw signal to accomplish cross-talk filtering.

Lastly, the software performs postprocessing of the DNA chromatogram. Postprocessing consists of deconvolution, dye mobility-shift correction, and signal normalization. Deconvolution partially reverses the effects of zone broadening: it narrows the peaks and eliminates peak overlap, which improves resolution. Dye mobility-shift correction adjusts the relative positions of peaks in different channels so that the peaks are proportionally distributed. Signal normalization normalizes peak intensity across channels, which improves the visual appearance of the DNA chromatogram.

DNA chromatogram processing prepares the data for base-calling and thereby facilitates automated and precise DNA sequencing. The software we developed, called GelRead, is designed for viewing and processing DNA chromatograms in ABI and DAT formats. GelRead works well and is comparable to commercial software such as ABI's. The thesis achieves its intended goal.
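As an illustration of the signal normalization step mentioned in the abstract, here is a minimal Python sketch that rescales each channel so that its largest sample reaches a common target height. Scaling by the channel maximum, the function name `normalize_channels`, and the `target` parameter are assumptions made for this example; the abstract does not say which normalization rule the thesis actually uses.

```python
from typing import Dict, List


def normalize_channels(
    channels: Dict[str, List[float]], target: float = 1000.0
) -> Dict[str, List[float]]:
    """Scale each channel so that its largest sample equals `target`."""
    normalized = {}
    for name, trace in channels.items():
        peak = max(trace, default=0.0)
        # A flat or empty channel has nothing to scale; copy it unchanged.
        scale = target / peak if peak > 0 else 1.0
        normalized[name] = [value * scale for value in trace]
    return normalized


# Two hypothetical channels whose peaks differ in height by a factor of five.
traces = {
    "A": [0.0, 50.0, 100.0, 50.0],
    "C": [0.0, 10.0, 20.0, 10.0],
}
balanced = normalize_channels(traces)
```

A single global scale factor per channel keeps relative peak heights within a channel intact; a windowed variant would be needed if signal strength drifts along the run.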
Keywords/Search Tags: DNA chromatogram, concentration of fluorophore, preprocessing, cross-talk filtering, postprocessing