Font Size: a A A

Research On Pairwise Constratints Dimentionaliry Reduction And Classifier Ensemble For MicroRNA Recognition

Posted on:2012-07-09Degree:MasterType:Thesis
Country:ChinaCandidate:S WeiFull Text:PDF
GTID:2210330338474022Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
MicroRNAs(MiRNAs) called micro molecule RNA are a family of RNA. At present, numerous studies show that miRNAs are related to gene expression, growth and behavior of biology. Earlier experimental cloning methods for searching miRNAs were low efficient, time consuming and very expensive, the identified accuracies are not satisfied. Then, machine learning techniques are proposed to recognize miRNAs and provide a new idea for large-scale prediction. Therefore, in this paper, we study deeply on the machine learning methods for recognizing miRNA and three new algorithms are proposed to improve the accuracy.1. Propose a semi-supervised dimensionality reduction algorithm LSLDA based on pairwise constraints. Studying the existing machine learning methods for recognizing miRNA shows that, most of them extract features from sequences and secondary structures of miRNAs based on biological theory. However, whether some of the extracted features influence the classification results are not considered much. Therefore, the dimensionality reduction method based on pairwise constraints is employed to improve the performance by removing the features with less contribution for classification. Comparing with the results on original training data set, LSLDA has a better improvement on time efficiency and performance.2. Propose an ensemble algorithm called En-LSLDA based on pairwise constraints. The LSLDA algorithm can extract good features by dimension reduction, but it can not overcome the uncertainty of the number of the pairwise constraints (different numbers and different contents of pairwise constraints will lead to different classification results). Due to the number of pairwise constraints corresponding to the best accuracy is unknown, we ensemble the base classifiers on each numbers of pairwise constraints to improve the whole accuracy. Experimental results validate the performance of the proposed algorithm.3. Propose an ensemble algorithm of heterogeneous classifiers EnH-LSLDA. An effective ensemble algorithm requires the base classifiers with high accuracy as well as diversity, so we choose a number of diverse subspaces from extracted low-dimensional spaces. High performance and diverse base classifiers are obtained after heterogeneous classifiers on these subspaces are learned. Then, majority voting is used to obtain the a good ensemble classifier. Experimental results on miRNA data sets and UCI data sets show the effectiveness of the newly proposed algorithm.
Keywords/Search Tags:MiRNA, Pairwise Constraints, Dimensionality Reduction, Ensemble Learning, Prediction
PDF Full Text Request
Related items