MiRNA Target Gene Prediction On Data Imbalance Problem Analysis Based On Support Vector Machine

Posted on: 2013-07-18
Degree: Master
Type: Thesis
Country: China
Candidate: P P Zhao
GTID: 2230330392454881
Subject: Computer system architecture
Abstract/Summary:
MicroRNAs (miRNAs) are a class of crucial gene regulators in organisms. They regulate growth and development by cleaving target mRNAs or repressing their translation, through perfect or imperfect complementary pairing with their targets. The prediction of miRNA targets therefore plays a crucial role in studying and understanding the biological functions of miRNAs. However, the number of experimentally verified positive targets is very small relative to the number of negative targets, so the positive and negative samples are seriously imbalanced. Accurate prediction of miRNA targets has become a bottleneck in miRNA research. To address the data imbalance problem in miRNA target prediction, this thesis proposes a bias discriminant SVM prediction algorithm and an SVM-based algorithm that uses different penalty parameters. Both combine miRNA:target sequence information with research on SVM prediction for imbalanced data, in a new attempt to improve the accuracy of miRNA target prediction.

First, this thesis defines a set of quantitative criteria based on features of the miRNA:target secondary structure and the 3' UTR region, and designs a program to extract the feature data and select the optimal feature subset.

Second, because a traditional SVM classifies targets with low accuracy, this thesis proposes a new algorithm, BD-SVM. BD-SVM uses the bias discriminant analysis criterion as the objective function for kernel optimization in the empirical feature space, and applies a kernel conformal transformation to optimize the kernel matrix step by step. The SVM is then built on the optimized kernel matrix and used to predict miRNA targets.

Third, because SVMs with different penalty parameters classify differently when the dataset is imbalanced, this thesis proposes a target prediction algorithm that calculates the penalty parameter from the average density of the miRNA target dataset. The algorithm compensates for the bias caused by the distribution of target samples in an imbalanced sample space.

Finally, this thesis uses the positive targets of human, mouse, and rat from miRecords as positive training data and, guided by microarray experiment results, selects negative training data from the NCBI Gene Expression Omnibus (GEO). The proposed algorithms are tested on an independent test dataset, and two different evaluation criteria are used to assess the two algorithms alongside several other popular prediction algorithms. Under both criteria, the proposed algorithms achieve higher accuracy than the other prediction algorithms.
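The abstract does not state the form of the kernel conformal transformation used in BD-SVM. In its commonly used form, which is assumed here and not taken from the thesis, a base kernel is rescaled by a positive factor function. The symbols k_0, q, K_0 and Q below are illustrative names, not the thesis's notation:

```latex
\tilde{k}(x, x') = q(x)\, q(x')\, k_0(x, x'),
\qquad
\tilde{K} = Q K_0 Q,
\quad
Q = \operatorname{diag}\bigl(q(x_1), \dots, q(x_n)\bigr).
```

Since v^T Q K_0 Q v = (Qv)^T K_0 (Qv) >= 0 for every vector v, the transformed matrix stays symmetric positive semidefinite whenever K_0 is. Under this reading, optimizing the kernel matrix step by step would mean repeatedly adjusting the parameters of q so that the discriminant criterion, evaluated on the transformed matrix, increases.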
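The idea behind the third contribution, giving each class its own penalty parameter, can be sketched as follows. This is a minimal illustration and not the thesis's algorithm: the abstract does not give the average-density formula, so the sketch substitutes the common inverse-class-frequency heuristic, uses a linear SVM trained by plain subgradient descent, and runs on made-up toy data. The function names `class_penalties`, `train_weighted_linear_svm` and `predict` are hypothetical.

```python
def class_penalties(labels, base_c=1.0):
    """Per-class penalties inversely proportional to class frequency.

    ASSUMPTION: this is a stand-in heuristic, not the average-density
    rule of the thesis, whose formula the abstract does not give.
    Labels must be +1 (positive target) or -1 (negative target).
    """
    n = len(labels)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {1: base_c * n / (2.0 * n_pos), -1: base_c * n / (2.0 * n_neg)}


def train_weighted_linear_svm(X, y, penalties, epochs=200, lr=0.01):
    """Minimise 0.5*||w||^2 + sum_i C[y_i] * max(0, 1 - y_i*(w.x_i + b))
    by full-batch subgradient descent with a constant step size."""
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the regulariser 0.5*||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin: hinge term is active
                c = penalties[yi]  # class-specific penalty
                for j in range(dim):
                    grad_w[j] -= c * yi * xi[j]
                grad_b -= c * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 from the sign of the decision function."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0.0 else -1


if __name__ == "__main__":
    # Made-up toy data: one positive sample, four negative samples.
    X = [[2.0, 2.0], [-1.0, -1.0], [-2.0, -1.0], [-1.0, -2.0], [-2.0, -2.0]]
    y = [1, -1, -1, -1, -1]
    penalties = class_penalties(y)
    w, b = train_weighted_linear_svm(X, y, penalties)
    print(penalties, w, b, [predict(w, b, x) for x in X])
```

The rare positive class receives the larger penalty, so a misclassified positive costs more than a misclassified negative. Replacing `class_penalties` with a density-based rule would change only how the two penalties are computed, not the training loop.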
Keywords/Search Tags: miRNA, target gene prediction, SVM, kernel matrix optimization, average density