Research On Gene Prediction Based On Theory And Methods Of Signal Processing

Posted on: 2009-10-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: B S Ma
Full Text: PDF
GTID: 1118360272487446
Subject: Communication and Information System
Abstract/Summary:
A gene is the basic unit of heredity: a DNA segment that carries genetic information. Non-gene regions cannot code for protein, so finding genes in DNA sequences has long been an important problem in bioinformatics. In this dissertation, the theory and methods of signal processing, including transform-domain analysis, digital filtering, time-frequency analysis, statistical learning, and intelligent algorithms, are applied to identify genes.

First, the theory of filtering for gene prediction is analyzed, and two important factors are identified: the length and the weak or strong periodicity of protein-coding regions. Based on the periodicity of coding regions, an FIR filter and adaptive filters with narrow pass-bands are developed. Exon locations are predicted by applying the filters to an annotated gene sequence. The experimental results indicate that the designed filters are valid and can improve the accuracy of gene identification.

Second, an improved Fourier transform approach is proposed that integrates the gene prediction filter with the Fourier transform. This algorithm amplifies period-3 signals, removes background noise, and, unlike existing Fourier methods, is not restricted by the length of the sequence being predicted. The experimental results show that the improved Fourier method increases predictive accuracy. In addition, a Fourier method based on a sliding window is applied to identify the coding and noncoding regions in DNA sequences.

Third, the forward algorithm, combined with a Hidden Markov Model of coding regions, is applied to predict exons in genes. Tests on an annotated gene sequence show that the designed algorithm is valid and reduces computational complexity. A gene classification algorithm based on the support vector machine is also designed. The experimental results show that the proposed method not only improves accuracy but also reduces the amount of training data required.

Finally, four features and three discriminant analysis methods are studied to improve predictive accuracy, and a gene identification algorithm based on multiple features is proposed. The experimental results indicate that the developed algorithm improves on the Fourier methods and is more accurate than the existing gene prediction method for short DNA sequences.
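To make the sliding-window Fourier idea concrete, here is a minimal Python sketch. It is not the dissertation's algorithm. It assumes the common binary-indicator mapping (one 0/1 sequence per base) and sums the squared DFT magnitude at frequency 1/3 over the four bases. The function names `period3_power` and `sliding_period3`, and the default window length and step, are hypothetical choices made for illustration.

```python
import cmath


def period3_power(window):
    """Sum over A, C, G, T of |DFT at frequency 1/3|^2 for one window.

    Assumption: each base is mapped to a binary indicator sequence
    (1 where the base occurs, 0 elsewhere).
    """
    total = 0.0
    for base in "ACGT":
        acc = 0j
        for n, ch in enumerate(window):
            if ch == base:
                # Contribution of this position to the DFT bin at 1/3.
                acc += cmath.exp(-2j * cmath.pi * n / 3)
        total += abs(acc) ** 2
    return total


def sliding_period3(seq, window=351, step=3):
    """Return (start, power) pairs, one per window position.

    The window length and step are illustrative defaults only.
    """
    seq = seq.upper()
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a strictly period-3 stretch flanked by a homopolymer.
    toy = "A" * 300 + "ATG" * 100 + "A" * 300
    profile = sliding_period3(toy, window=150, step=3)
    best_start, best_power = max(profile, key=lambda p: p[1])
    print(best_start, round(best_power, 2))
```

On real DNA the period-3 peak is far weaker than in this toy sequence, so a practical detector would also need a threshold or a background estimate, which this sketch leaves out.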
Keywords/Search Tags:Bioinformatics, Gene Prediction, Fourier Transform, Filter, Hidden Markov Model, Support Vector Machine