Prediction Of PiRNAs Based On Relative Composition Of K-mer Strings

Posted on: 2013-07-07  Degree: Master  Type: Thesis
Country: China  Candidate: L Y Tang  Full Text: PDF
GTID: 2230330374477043  Subject: Basic mathematics
Abstract/Summary:
Bioinformatics is an interdisciplinary field built on mathematics, computer science, and the life sciences. It covers all aspects of biological information, including acquisition, processing, storage, distribution, analysis, and interpretation. Its research scope is very broad, and non-coding RNAs are among its most popular topics. RNAs in an organism can be divided into coding RNAs and non-coding RNAs. Research shows that non-coding RNA genes produce a functional RNA product rather than a translated protein. Non-coding RNAs include many important RNAs; those typically 20~30 nt in length are called small RNAs. Small non-coding RNAs are abundant in higher organisms, where they form very complex biological regulatory networks. So far, three types of small non-coding RNAs have been found in eukaryotes: miRNA, siRNA, and piRNA. At present, both experimental and computational methods are used to detect miRNA. Experimental methods include cloning and sequencing, miRNA microarrays, and hybridization experiments. Computational methods are based on decision trees and machine learning, such as k-grams, support vector machines, and Bayesian statistics. piRNA (PIWI-interacting RNA), generally 25~32 nt in length, is a novel class of small RNAs. piRNAs exist in many species, including human, rat, mouse, and fruit fly. An effective and reliable prediction system for human piRNA is needed. In this paper, we use all 340 strings of 1~4 nt to characterize human piRNA sequences, and each feature vector is determined by a weighting scheme. A support vector machine (SVM) is then used to perform the prediction. All experimental results are obtained by 5-fold cross-validation. The results show that our method performs well in predicting human piRNAs.
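
The feature construction described above can be illustrated with a minimal sketch. The count of 340 follows from enumerating every string of length 1 to 4 over a four-letter nucleotide alphabet (4 + 16 + 64 + 256). The abstract does not specify the weighting scheme, so the per-length relative frequency used here is only an assumption, and the names `ALPHABET`, `MAX_K`, `KMERS`, and `kmer_features` are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA-style letters; any U in the input is mapped to T
MAX_K = 4          # strings of length 1 to 4, as stated in the abstract

# Every k-mer for k = 1..4, in a fixed order: 4 + 16 + 64 + 256 = 340 strings.
KMERS = ["".join(p) for k in range(1, MAX_K + 1) for p in product(ALPHABET, repeat=k)]


def kmer_features(seq):
    """Return a 340-dimensional vector of relative k-mer frequencies.

    Assumed weighting (not taken from the thesis): each count is divided by
    the number of length-k windows in the sequence.
    """
    seq = seq.upper().replace("U", "T")
    counts = dict.fromkeys(KMERS, 0)
    for k in range(1, MAX_K + 1):
        for i in range(len(seq) - k + 1):
            window = seq[i:i + k]
            # Windows with ambiguous bases (e.g. N) are skipped but still
            # count toward the denominator below.
            if window in counts:
                counts[window] += 1
    features = []
    for kmer in KMERS:
        windows = len(seq) - len(kmer) + 1
        features.append(counts[kmer] / windows if windows > 0 else 0.0)
    return features
```

Vectors of this kind, computed for known piRNA and non-piRNA sequences, could then be passed to any SVM implementation and evaluated with 5-fold cross-validation; the choice of kernel and parameters is not given in the abstract.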
Keywords/Search Tags: Bioinformatics, Non-Coding RNAs, miRNA, piRNA, k-mer string, Support vector machine