Research On RNA Secondary Structure Prediction Algorithms

Posted on: 2013-06-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Xing
Full Text: PDF
GTID: 1228330395459630
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, a growing body of research has shown that RNA plays a very important role in life processes. RNA molecules are not only carriers of genetic information in living cells but also have other important functions, such as regulating gene expression, catalyzing mRNA splicing, and processing and modifying RNA precursors. Research on RNA molecules has therefore long been one of the important fields of bioinformatics. The function of an RNA molecule is closely related to its structure, so further exploration of RNA function relies on knowledge of RNA secondary structure. The most accurate way to determine structure is X-ray diffraction or nuclear magnetic resonance, but this is difficult: these methods are expensive and slow, and most RNA molecules cannot currently be crystallized. Therefore, the main recognized approach is to predict the structure computationally with various algorithms.

In this dissertation, we study RNA secondary structure prediction methods in depth. They include methods based on the thermodynamic energy minimization principle (such as Zuker's mfold method, the base pair maximization algorithm, etc.), methods based on comparative sequence analysis (such as the covariance mutation prediction model and the stochastic context-free grammar algorithm), heuristic algorithms (genetic algorithms, simulated annealing), and so on. By studying these methods, we summarize their respective advantages and disadvantages and derive the idea for a new prediction method, which lays a solid theoretical foundation for the rest of this work.

Firstly, we use a least squares support vector machine (LS-SVM) on the basis of principal component analysis to predict tRNA. Due to the equality constraints in the formulation, a set of linear equations has to be solved instead of a quadratic programming problem.
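As a minimal sketch of why LS-SVM training reduces to a linear solve, the following assumes the standard LS-SVM classifier formulation with an RBF kernel. The function names, the toy data, and the values of `gamma` and `sigma` are illustrative assumptions and are not taken from the dissertation; the PCA preprocessing step is omitted.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Squared Euclidean distance between every row of A and every row of B.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-d2 / (2.0 * sigma**2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    # Builds and solves the (n+1) x (n+1) linear system
    #   [ 0   y^T             ] [ b     ]   [ 0 ]
    #   [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    # with Omega_ij = y_i * y_j * K(x_i, x_j). No quadratic program is needed.
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, y_train, b, alpha, X_new, sigma=1.0):
    # Decision function: sign(sum_i alpha_i * y_i * K(x, x_i) + b).
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)

# Toy usage with two well-separated clusters (hypothetical data).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y = np.array([-1., -1., -1., 1., 1., 1.])
b, alpha = lssvm_train(X, y)
labels = lssvm_predict(X, y, b, alpha, X)
```

The regularization parameter `gamma` appears only on the diagonal of the system matrix, so a single call to a dense linear solver is enough for small training sets.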
Compared with the traditional support vector machine (SVM), LS-SVM converts the inequality constraints into equality constraints, making the training of the SVM equivalent to solving a system of linear equations. Principal component analysis (PCA) is commonly used for feature extraction from high-dimensional data. We used this approach to analyze the statistical features of nucleotide sequences and then used the LS-SVM to predict ncRNA. The results indicate that the proposed method is suitable for prokaryotic ncRNA prediction.

Secondly, we propose PSOfold, a particle swarm optimization (PSO) algorithm for RNA secondary structure prediction, to improve on the performance of the recently published IPSO. To enhance the ability to search for the optimal solution, fuzzy logic control is applied to adaptively adjust the PSO parameters, namely the inertia weight, the learning factors, and the number of particles.

Finally, to further resolve the stem permutation problem, we put forward a solution conversion strategy that transforms the discrete values of stems into an ordered stem combination. The experimental results show that our method is effective for RNA folding in terms of sensitivity, specificity, and F-measure when compared with other methods based on evolutionary algorithms and swarm intelligence algorithms.
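The three evaluation measures named above can be sketched as follows. The abstract does not define them, so this assumes the convention common in RNA structure prediction, where a structure is a set of base pairs, sensitivity is TP / (TP + FN), and "specificity" is computed as TP / (TP + FP). The function name and the example pairs are hypothetical.

```python
def pair_metrics(predicted, reference):
    """Compare two secondary structures given as collections of (i, j) base pairs."""
    pred, ref = set(predicted), set(reference)
    tp = len(pred & ref)   # pairs predicted and present in the reference
    fp = len(pred - ref)   # pairs predicted but absent from the reference
    fn = len(ref - pred)   # reference pairs that were missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0  # assumed convention (PPV)
    denom = sensitivity + specificity
    f_measure = 2 * sensitivity * specificity / denom if denom else 0.0
    return sensitivity, specificity, f_measure

# Hypothetical example: three of four reference pairs recovered, one false pair.
reference = [(1, 10), (2, 9), (3, 8), (4, 7)]
predicted = [(1, 10), (2, 9), (3, 8), (5, 6)]
sens, spec, f1 = pair_metrics(predicted, reference)
```

The F-measure is the harmonic mean of the other two values, so it is high only when few true pairs are missed and few false pairs are predicted.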
Keywords/Search Tags: RNA secondary structure prediction, Particle swarm optimization, Fuzzy logic control, Principal component analysis, Least squares support vector machine (LS-SVM)