
An Algorithm Of Gene Exon Prediction Based On GMM

Posted on: 2013-09-19  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Zhang  Full Text: PDF
GTID: 2248330362461767  Subject: Circuits and Systems
Abstract/Summary:
Bioinformatics has become a popular research field as science and technology have advanced. Since the completion of the Human Genome Sequencing Project, the number of sequences and bases in the three main nucleic acid databases, GenBank, EMBL and DDBJ, has grown exponentially. Recognizing the exon regions of gene sequences is therefore a necessary step in handling this explosive growth in DNA sequence data. This thesis combines signal processing methods with the biological characteristics of gene sequences and predicts exons based on the periodicity of coding regions.

The thesis introduces gene exon prediction from four aspects. The first part covers background knowledge of bioinformatics and, on that basis, describes the state of research on gene prediction and its significance. The second part presents numeric mapping methods chosen according to biological characteristics, and explains the period-3 property of sequences in coding regions to lay a foundation for feature extraction. Feature extraction is the key part of the thesis. It describes the usual time-domain and frequency-domain methods for feature extraction, such as the average magnitude difference function, singular value decomposition, the discrete Fourier transform (DFT), and the paired and weighted spectral rotation measure. In addition, information entropy and the sum of magnitude difference square function (SMDSF), which is widely used in music signal processing, are applied to gene prediction and achieve good results. Finally, the thesis combines time-domain and frequency-domain features and performs gene prediction on the resulting multi-dimensional features using a statistical learning method. A Gaussian mixture model (GMM) is chosen as the classifier and predicts exons in the test database using parameters obtained through training. The core algorithm of the GMM, expectation-maximization (EM), is discussed in detail.

In sum, the thesis applies signal processing methods to biological signals and, after an in-depth study of gene sequences, successfully predicts exons in gene sequences.
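As an illustration of the period-3 property mentioned in the abstract, the following is a minimal Python sketch of a DFT-based period-3 feature. It is not the thesis's implementation. The binary indicator (Voss-style) mapping, the function names, the window length of 351 and the step of 3 are all assumptions chosen for illustration, since the abstract does not state which mapping or window the thesis uses.

```python
import numpy as np

def indicator_sequences(seq):
    """Map a DNA string to four binary indicator sequences (A, C, G, T).

    Assumption: a Voss-style binary mapping; the abstract does not say
    which numeric mapping the thesis actually uses.
    """
    seq = seq.upper()
    return np.array([[1.0 if base == b else 0.0 for base in seq]
                     for b in "ACGT"])

def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four indicators.

    Dividing by the sequence length is an illustrative normalization.
    """
    x = indicator_sequences(seq)
    n = x.shape[1]
    # Complex exponential at frequency 1/3 cycles per base.
    w = np.exp(-2j * np.pi * np.arange(n) / 3)
    coeffs = x @ w  # one DFT coefficient per nucleotide indicator
    return float(np.sum(np.abs(coeffs) ** 2)) / n

def sliding_period3(seq, window=351, step=3):
    """Period-3 power in each sliding window (hypothetical parameters)."""
    return [period3_power(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]
```

A sequence with a strong three-base repeat, such as "ATG" repeated many times, gives a large value from period3_power, while a sequence with no three-base structure gives a value near zero. In a pipeline like the one the abstract describes, a per-window profile of this kind would be one of several features passed to the classifier.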
Keywords/Search Tags: DNA sequence, period-3 behavior, SMDSF, information entropy, Gaussian mixture model