Prediction Of Plant MicroRNA Using Support Vector Machine

Posted on: 2015-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: C Sun
GTID: 2180330467986598
Subject: Computer application technology
Abstract/Summary:
MicroRNAs (miRNAs) are a family of ~21 nt non-coding RNAs that play important roles at the post-transcriptional level in animals, plants and viruses. These molecules recognize target genes and repress their transcription or translation. Many studies have shown that miRNAs are involved in biological responses to a variety of biotic and abiotic stresses. Identifying these molecules and their targets can aid the understanding of regulatory processes.

Recently, computational prediction methods based on machine learning have been widely used to predict miRNAs, study miRNA target genes, analyze miRNA-miRNA interactions, and for many other purposes. However, most of these methods are designed only for mammals or cannot predict the mature miRNAs within pre-miRNAs. Therefore, miPlantPreMat, an integrated classification model based on the support vector machine (SVM), was trained specifically to distinguish real from pseudo plant pre-miRNAs together with their mature miRNAs.

Since the discovery of the first miRNA, many miRNA-related features have been proposed by the scientific community. By summarizing these features and analyzing known plant miRNA sequences, an initial set of 152 novel features related to sequence structure was selected. Based on a comprehensive analysis of the information gain of each feature and its contribution to SVM classification, we propose an improved SVM-RFE method. Feature selection with this method, combined with the SMOTE method and grid search for parameters, improved the classification results. Tests on Arabidopsis, soybean, tomato and 6 other kinds of plant data sets show that the method classifies plant miRNA precursors and mature miRNAs with high accuracy and generalizes across species. A total of 522 potential tomato miRNAs were predicted from the tomato genome by miPlantPreMat, and 3214 miRNA-target gene interactions were derived with psRNATarget. Finally, a miRNA-target gene interaction network was constructed from these relationships, providing a reference for experimental biology.

In summary, this thesis uses SVM-based classification models and algorithms built on structural features of plant miRNAs to solve the classification of pre-miRNAs and mature miRNAs. Experiments on plant and simulated data show improvements in both feature quality and classifier performance.
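
The abstract mentions ranking features by information gain before SVM-based selection. The following is only a minimal sketch of that ranking step, not the thesis's implementation: the feature names (`gc_content`, `stem_length`, `noise`), the toy values, and the choice to score each numeric feature by its best single-threshold split are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Information gain from splitting a numeric feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    conditional = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - conditional


def best_information_gain(values, labels):
    """Highest gain over midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max((information_gain(values, labels, t) for t in thresholds), default=0.0)


def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    columns = list(zip(*rows))
    scores = {name: best_information_gain(col, labels) for name, col in zip(names, columns)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: label 1 = real pre-miRNA, 0 = pseudo hairpin.
names = ["gc_content", "stem_length", "noise"]
rows = [
    (0.40, 22, 0.9),
    (0.42, 21, 0.1),
    (0.45, 23, 0.5),
    (0.60, 12, 0.8),
    (0.62, 11, 0.2),
    (0.65, 13, 0.4),
]
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In a pipeline like the one described, low-gain features such as the `noise` column here would be candidates for early removal, with the remaining features passed on to SVM-based recursive elimination.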
Keywords/Search Tags: miRNA, classification, support vector machine, feature selection