
Eukaryotic DNA Splice Site Identification Based On The Machine Learning Approach

Posted on: 2008-10-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Xue
Full Text: PDF
GTID: 2178360218952800
Subject: Computer application technology
Abstract/Summary:
With the completion of the Human Genome Project, fully understanding the function of the human genome has become an essential goal of the post-genome era. Reaching this goal requires answers to several key subproblems, among which the recognition of splice sites in eukaryotic DNA is one of the most important. Splice site recognition is a central part of eukaryotic DNA sequence analysis. Distinguishing the real splice sites from the large number of sequences that conform to the GT-AG rule is essentially a pattern classification problem. The introduction of machine learning methods has greatly improved the recognition rate of splice sites in eukaryotic DNA.

Splice sites are the boundary regions between the exons and introns of eukaryotic DNA sequences. If the real splice sites of a eukaryotic DNA sequence can be distinguished, the expressed and non-expressed regions of that sequence can be detected. To improve prediction performance, the splice site identification system is designed and built on the hidden Markov model (HMM) algorithm. Based on the sequence conservation in the vicinity of splice sites, the HM-SVM training set optimization algorithm is used to train and optimize the constructed HMM, which efficiently extracts the statistical features of the marginal and conditional distributions of the conserved sequences near splice sites. The experimental results show that the HMM identification system achieves a higher splice site identification rate than popular machine learning techniques.
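
The classification problem described above can be illustrated with a minimal sketch. It is not the thesis's system: a position-specific first-order Markov chain stands in for the HMM, and the HM-SVM training set optimization is not reproduced. The function names, the window length, and the toy sequences are all hypothetical and chosen only for illustration. The sketch enumerates donor-side candidates that satisfy the GT-AG rule, estimates the marginal distribution of the first base and the conditional distribution of each base given its predecessor, and scores a window by a log-likelihood ratio between a model trained on true sites and one trained on false sites.

```python
import math

ALPHABET = "ACGT"

def candidate_donor_sites(seq):
    """Return every index i where seq[i:i+2] == 'GT' (donor side of the GT-AG rule)."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]

def train_markov(windows, pseudocount=1.0):
    """Estimate P(x_0) and position-specific P(x_k | x_{k-1}) from equal-length windows."""
    length = len(windows[0])
    first = {b: pseudocount for b in ALPHABET}
    trans = [{a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
             for _ in range(length - 1)]
    for w in windows:
        first[w[0]] += 1
        for k in range(1, length):
            trans[k - 1][w[k - 1]][w[k]] += 1
    # Normalize counts into probabilities.
    total = sum(first.values())
    first = {b: c / total for b, c in first.items()}
    for table in trans:
        for a in ALPHABET:
            row_total = sum(table[a].values())
            for b in ALPHABET:
                table[a][b] /= row_total
    return first, trans

def log_likelihood(window, model):
    """Log-probability of a window under a (marginal, conditional) model."""
    first, trans = model
    ll = math.log(first[window[0]])
    for k in range(1, len(window)):
        ll += math.log(trans[k - 1][window[k - 1]][window[k]])
    return ll

def score(window, true_model, false_model):
    """Log-likelihood ratio: positive values favor a real splice site."""
    return log_likelihood(window, true_model) - log_likelihood(window, false_model)

# Hypothetical toy windows (8 bases, 'GT' at offsets 2-3); not data from the thesis.
TRUE_WINDOWS = ["AGGTAAGT", "AGGTGAGT", "AGGTAAGT"]
FALSE_WINDOWS = ["CCGTCCCC", "TTGTTTTT", "CAGTCTCA"]

true_model = train_markov(TRUE_WINDOWS)
false_model = train_markov(FALSE_WINDOWS)
```

In use, each index returned by `candidate_donor_sites` would be expanded into a fixed-length window around the GT dinucleotide, and the window would be accepted as a real site when `score` exceeds a threshold chosen on held-out data, for example by cross validation.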
Keywords/Search Tags: Hidden Markov Model, HM-SVM, Splice Site, Identification, Cross Validation