Study On Gene Identification Using Signal Processing Methods

Posted on: 2011-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z Wang
Full Text: PDF
GTID: 2198330338483641
Subject: Signal and Information Processing
Abstract/Summary:
With the development of computer technology and its application to bioinformatics, DNA sequencing work is nearing completion. For the resulting large volume of genomic data, identifying the different parts of a sequence is one of the most important tasks in bioinformatics. This dissertation combines signal analysis with gene identification in order to discover the characteristics that distinguish the different structures in gene sequences, so that established signal processing methods can be applied in this area.

First, the dissertation introduces the basic theory and development of bioinformatics. Then, based on the genome structure of prokaryotes and eukaryotes, it concludes that coding sequences exhibit a 3-period (period-3) feature that is rarely observed in non-coding sequences.

Second, building on this difference between coding and non-coding regions, we introduce several methods for representing a DNA sequence as a digital signal and propose a new mapping that reduces the dimension of the numerical sequence. The resulting signal can then be analyzed with signal processing methods such as statistical correlation analysis, the Fourier transform, the wavelet transform, and digital filtering. By observing the 3-period feature of the sequence, we can identify the coding regions.

Finally, we recast gene identification as a classification problem over DNA sequences. A support vector machine (SVM), which is grounded in statistical learning theory, is used to learn the characteristics of the different parts of DNA sequences in existing databases. The trained SVM can then classify unknown sequences.
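The 3-period feature can be illustrated with a short Python sketch. This is not the dissertation's own method: it assumes the common binary indicator mapping (one 0/1 sequence per base) rather than the reduced-dimension mapping proposed in the thesis, and the function names, window size, and step are hypothetical choices made only for illustration. It evaluates the Fourier transform at frequency 1/3 over a sliding window; windows with strong codon-position structure should give a larger value.

```python
import cmath

BASES = "ACGT"


def indicator_sequences(dna):
    """Binary indicator mapping (assumed here): one 0/1 sequence per base."""
    dna = dna.upper()
    return {b: [1 if ch == b else 0 for ch in dna] for b in BASES}


def power_at_period(dna, period=3):
    """Sum over the four bases of |DFT|^2 evaluated at frequency 1/period."""
    total = 0.0
    for seq in indicator_sequences(dna).values():
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * i / period)
            for i, x in enumerate(seq)
        )
        total += abs(coeff) ** 2
    return total


def sliding_period3(dna, window=351, step=3):
    """Return (start, period-3 power) for each window position."""
    return [
        (start, power_at_period(dna[start:start + window]))
        for start in range(0, len(dna) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a strictly periodic stretch flanked by homopolymer runs.
    toy = "A" * 60 + "ATG" * 20 + "A" * 60
    for start, power in sliding_period3(toy, window=60, step=30):
        print(start, round(power, 2))
```

On real genomic data the peak is far less clean than in this toy sequence, so a threshold on the window power would have to be tuned, and the window length trades positional resolution against noise.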
Keywords/Search Tags: Gene identification, coding region, spectrum analysis, support vector machine, sequence analysis