
Analysis Of Arabidopsis Genes Structure Based On Tiling Array

Posted on: 2009-02-25
Degree: Master
Type: Thesis
Country: China
Candidate: C S Lin
Full Text: PDF
GTID: 2120360272990765
Subject: Systems Engineering
Abstract/Summary:
This thesis, carried out in cooperation with Dr. Q. Quinn Li of the Department of Botany, Miami University, studies genome re-annotation by means of gene-analysis programs, computation, and mathematical algorithms. It is based on transcription data from four types of Arabidopsis thaliana samples (wild type, mutant, complementary, and DNA) and on earlier findings about their transcript structures, both provided by Dr. Li's research group. In research on gene-finding algorithms, protein structure and function prediction, and data visualization, it remains an interesting and challenging task to locate the coding sequences (CDS) of protein and RNA genes on a genome and to characterize the large amount of information carried by non-coding regions, by extracting useful knowledge from background data that are huge, incomplete, noisy, ambiguous, and random. As biology and bioinformatics develop, gene segmentation, an important preliminary step in gene structure analysis, is receiving growing attention from researchers, and higher accuracy and validity are demanded of it. In this work, the start and stop positions of CDS and the boundaries between introns and exons were determined and compared with the known genome annotation, and the accuracy and validity of the segmentation were verified through data visualization; this is significant for studies of gene function and for transcript analysis. However, little is known about the errors that unknown factors introduce into the analysis of Arabidopsis gene structure, or about how to choose an optimal cut-off criterion. Segmenting or comparing existing gene sequences is also inefficient and time-consuming, owing to the intrinsic defects and noise of microarray chips and to the dispersed, diverse, and complex structure of eukaryotic genes.
So far, no formally published article has reported the use of tiling array hybridization data for the analysis of Arabidopsis gene structure. In this thesis, effective methods for this analysis, and for visualizing its results, were investigated using various biological signal processing programs and mathematical algorithms. First, the large set of probe data was preprocessed with the DNA reference algorithm, in combination with Partek software, to correct for the sequence-dependent response of the oligonucleotide probes and to make the signals of different probes quantitatively comparable. Next, the probe data were simplified, as the practical situation in this work required, by segmenting them with an SCM model constructed by a dynamic algorithm. The segment boundaries (separating points) and state parameters of the model were obtained with a range of statistical methods. Finally, the tables of probe intensity, annotation, and segment information were loaded into a MySQL database, and the segmentation results were displayed in ProbeViewer, a program written for this work, to help biologists analyze gene structure intuitively.
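The segmentation step can be illustrated with a minimal sketch. It assumes that the SCM model treats probe intensity along the chromosome as piecewise constant and that "dynamic algorithm" refers to dynamic programming; the abstract confirms neither, and the function name `segment_piecewise_constant`, the fixed segment count, and the sample intensities are hypothetical. The sketch finds the boundaries that minimize the total squared deviation of each segment from its own mean.

```python
from typing import List, Sequence, Tuple


def segment_piecewise_constant(
    y: Sequence[float], n_segments: int
) -> Tuple[List[int], List[float]]:
    """Split y into n_segments contiguous runs that minimize the summed
    squared deviation of each run from its own mean (illustrative only).

    Returns (starts, means): the start index and mean level of each segment.
    """
    n = len(y)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(y)")

    # Prefix sums let us evaluate the cost of any run y[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for idx, v in enumerate(y):
        s1[idx + 1] = s1[idx] + v
        s2[idx + 1] = s2[idx] + v * v

    def cost(i: int, j: int) -> float:
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of covering y[:j] with exactly k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + cost(i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    back[k][j] = i

    # Follow the back-pointers from the end to recover the segment starts.
    starts: List[int] = []
    j = n
    for k in range(n_segments, 0, -1):
        i = back[k][j]
        starts.append(i)
        j = i
    starts.reverse()
    ends = starts[1:] + [n]
    means = [(s1[e] - s1[s]) / (e - s) for s, e in zip(starts, ends)]
    return starts, means


if __name__ == "__main__":
    # Hypothetical normalized probe intensities: low, high, low.
    intensities = [0.1, 0.0, -0.1, 2.0, 2.1, 1.9, 0.0, 0.1]
    print(segment_piecewise_constant(intensities, 3))
```

In this sketch the number of segments is supplied by the caller. Choosing that number, which corresponds to the cut-off question raised in the abstract, would need a separate criterion that is not shown here.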
Keywords/Search Tags: Bioinformation, Sequence Segmentation, Analysis of Arabidopsis Genes Structure