Prediction Of Alternative Splice Site And Analysis Of Sequence Characteristic In Human Genome

Posted on: 2011-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: P F Zhang
Full Text: PDF
GTID: 2120360305991331
Subject: Physics
Abstract/Summary:
Alternative processing of mRNA is a basic distinction between eukaryotes and prokaryotes, and a key mechanism that enriches the proteomic diversity and functional complexity of higher multicellular eukaryotes by producing several transcripts from a single gene. Alternative splicing of pre-mRNA is specific to particular developmental stages and tissues of an organism, and it plays an important role in development, differentiation, and cancer. First, this paper analyses basic conservation features and spatial structure characteristics of splice sites and pseudo splice sites in the human genome. Based on nucleotide conservation and the spatial structure characteristics of the regions upstream and downstream of splice sites, an information vector for splice sites is constructed. Second, support vector machine (SVM) models that use the information vector features are developed to predict donor and acceptor splice sites in the human genome. In five-fold cross-validation, the total prediction accuracies are 92.55% for donors and 90.70% for acceptors. With a three-way data split, the total accuracies are 92.25% for donors and 89.87% for acceptors. At the sequence level, there is no obvious difference between alternative and constitutive splice sites. Moreover, in alternative splicing events the two donor (or acceptor) sites of the same exon lie very close to each other. The theoretical prediction of alternative splice sites therefore remains a challenge. This paper presents an approach for predicting alternative splice sites based on a position-correlation weight matrix (PCWM) and DNA structural information. The predictive success rates are 73.32% for donor sites and 74.62% for acceptor sites. These results are better than those of recent methods based on the mechanism of splice site competition.
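
As a rough illustration of weight-matrix scoring of candidate splice sites, the sketch below builds a plain position weight matrix (PWM) from aligned site windows and scores a sequence by summing per-position log-odds. This is a simpler, position-independent relative of the PCWM named in the abstract, not the thesis's method: the abstract does not define how position correlations or DNA structural information enter the score. The function names, the uniform background of 0.25, the pseudocount, and the toy sequences are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds PWM from equal-length aligned site windows.

    Assumes a uniform background frequency of 0.25 per base (an
    illustrative choice, not taken from the thesis).
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for seq in sites:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2((counts[base] / total) / 0.25)
                    for base in BASES})
    return pwm

def score(pwm, seq):
    """Sum the per-position log-odds of seq under the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))

# Toy donor-like windows (hypothetical data, not from the thesis).
toy_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT"]
pwm = build_pwm(toy_sites)

# A window matching the toy consensus should outscore one that breaks the GT core.
print(score(pwm, "CAGGTAAGT") > score(pwm, "CATTCAAGT"))
```

In a pipeline like the one the abstract outlines, scores of this kind could serve as one feature among several passed to a classifier; how the thesis actually combines them is not stated here.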
Keywords/Search Tags: alternative splice sites, constitutive splice sites, position-correlation weight matrix, DNA structural information, support vector machine