Identification Of Coding And Non-coding Sequences In A Complete Genome Using Hölder Exponent Formalism And Multiaffinity Analysis

Posted on: 2008-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: X C Li
GTID: 2120360218457880
Subject: Applied Mathematics
Abstract/Summary:
Accurate prediction of genes in genomes has always been a challenging task for bioinformaticians and computational biologists. The discovery of distinct scaling relations in coding and non-coding sequences has led to new perspectives in the understanding of DNA sequences. This has motivated us to find new ways to characterize and classify coding and non-coding sequences.

In this thesis, we first introduce a number sequence representation of DNA sequences proposed by our group. Multiaffinity analysis and the Hölder formalism are then applied to the resulting number sequence. Three suitable exponents are selected to form a parameter space: the two exponents γ(-2) and γ(6) come from multiaffinity analysis, and the exponent h comes from the Hölder formalism. Each coding or non-coding sequence can then be represented by a point in the three-dimensional parameter space. For the complete genomes of many prokaryotes, the points corresponding to coding and non-coding sequences fall roughly into different regions. If the point (γ(-2), γ(6), h) for a DNA sequence lies in the region corresponding to coding sequences, the sequence is classified as a coding sequence; otherwise, it is classified as a non-coding one. These exponents can therefore be used to distinguish coding from non-coding sequences. Fisher's discriminant algorithm is used to compute the discriminant accuracies. The average discriminant accuracies pc, pnc, qc and qnc over all 51 prokaryotes obtained by the present method reach 66.53%, 83.34%, 71.63% and 83.54%, respectively.
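The classification step described above can be illustrated with a minimal sketch of Fisher's linear discriminant on points in a three-dimensional parameter space. This is not the thesis's implementation: the points are synthetic stand-ins for (γ(-2), γ(6), h), the cluster centres and spreads are arbitrary, the function names `fisher_direction` and `is_coding` are hypothetical, and placing the threshold at the midpoint of the projected class means is an assumed (though common) choice.

```python
import numpy as np


def fisher_direction(X_c, X_nc):
    """Fisher's discriminant direction w and a midpoint threshold.

    X_c, X_nc: arrays of shape (n, 3), one row per sequence, holding
    hypothetical (gamma(-2), gamma(6), h) values for each class.
    """
    mu_c = X_c.mean(axis=0)
    mu_nc = X_nc.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    S_w = (X_c - mu_c).T @ (X_c - mu_c) + (X_nc - mu_nc).T @ (X_nc - mu_nc)
    # w = S_w^{-1} (mu_c - mu_nc) maximises Fisher's separation criterion.
    w = np.linalg.solve(S_w, mu_c - mu_nc)
    # Assumed threshold: midpoint between the projected class means.
    threshold = 0.5 * (w @ mu_c + w @ mu_nc)
    return w, threshold


def is_coding(points, w, threshold):
    """True where a point projects onto the coding side of the threshold."""
    return points @ w > threshold


# Synthetic data only; these are NOT values from the thesis.
rng = np.random.default_rng(0)
coding = rng.normal(loc=[0.6, 0.4, 0.5], scale=0.05, size=(200, 3))
noncoding = rng.normal(loc=[0.4, 0.6, 0.3], scale=0.05, size=(200, 3))

w, t = fisher_direction(coding, noncoding)
# Fraction of each synthetic class assigned to its own region.
acc_c = is_coding(coding, w, t).mean()
acc_nc = (~is_coding(noncoding, w, t)).mean()
print(f"coding: {acc_c:.3f}, non-coding: {acc_nc:.3f}")
```

The two printed fractions are computed on the same synthetic points used to fit the direction, so they say nothing about the accuracies reported in the abstract; they only show how a point's position relative to the discriminant threshold decides its label.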
Keywords/Search Tags: coding/noncoding sequences, genome, multiaffinity analysis, wavelet transform modulus maxima methodology, Hölder exponent