
Research on Improved Clustering Algorithms for RNA Secondary Structure

Posted on: 2015-12-08
Degree: Master
Type: Thesis
Country: China
Candidate: H T Yan
GTID: 2180330422971019
Subject: Computer application technology
Abstract/Summary:
RNA is a very important molecule in biological systems and plays an important role in heredity, especially in viruses. For example, the genetic material of HIV is RNA rather than DNA. RNA is also involved in important life processes such as protein synthesis and metabolism and cell differentiation. Because the function of a biological molecule is usually closely related to its structure, RNA secondary structure has drawn growing attention as more researchers study RNA molecules. In bioinformatics, RNA secondary structure prediction mainly relies on free energy, but it remains difficult to select the real RNA secondary structure from the many candidate structures at or near the minimum free energy. A common approach in the biological community is to cluster the suboptimal structures of an RNA. This thesis improves two clustering algorithms used to cluster RNA secondary structures.

First, we study the traditional hierarchical clustering algorithm as applied to RNA secondary structures, point out its low efficiency, and improve it. The improved algorithm has three parts: first, compute the distances between the RNA secondary structures using RBP scores to obtain a distance matrix; second, compute the distance between every pair of rows of the matrix and sort these distances in ascending order; third, cluster according to the sorted distances.

Second, we study the traditional k-means clustering algorithm and propose an improved way to select its initial cluster centers. The algorithm first calculates the centroid of each cluster produced by the improved hierarchical clustering algorithm and uses these centroids as the initial centers. It then calculates the distance from each remaining object to the initial centers and, following the minimum-distance principle, assigns each object to the nearest cluster. Finally, it adjusts the clustering result until the cluster centroids no longer change.

Finally, the two improved clustering methods are tested experimentally and the results are analyzed.
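
The two procedures summarized in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not give the RBP score computation, so the distance matrix `D` below is made-up toy data, and the function names (`sorted_pair_clusters`, `centroid`, `seeded_kmeans`) are hypothetical. The sketch also assumes that rows of the matrix are compared by Euclidean distance and that merging closest rows first stops once `k` clusters remain; the abstract specifies neither.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def sorted_pair_clusters(rows, k):
    """Merge the closest pairs of rows first until k clusters remain."""
    parent = list(range(len(rows)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Distance between every pair of rows, sorted from small to large.
    pairs = sorted(combinations(range(len(rows)), 2),
                   key=lambda p: dist(rows[p[0]], rows[p[1]]))
    remaining = len(rows)
    for i, j in pairs:
        if remaining <= k:  # assumed stopping rule
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            remaining -= 1

    groups = {}
    for i in range(len(rows)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def centroid(rows, members):
    """Component-wise mean of the rows whose indices are in members."""
    n = len(members)
    return [sum(rows[m][d] for m in members) / n
            for d in range(len(rows[0]))]


def seeded_kmeans(rows, k, max_iter=100):
    """k-means whose initial centers are the centroids of the clusters above."""
    centers = [centroid(rows, g) for g in sorted_pair_clusters(rows, k)]
    labels = []
    for _ in range(max_iter):
        # Minimum-distance principle: assign each object to the nearest center.
        labels = [min(range(len(centers)), key=lambda c: dist(r, centers[c]))
                  for r in rows]
        new_centers = []
        for c in range(len(centers)):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(centroid(rows, members) if members else centers[c])
        if new_centers == centers:  # centroids no longer change
            break
        centers = new_centers
    return labels, centers


# Toy symmetric distance matrix for four structures (illustrative values only).
D = [[0, 1, 8, 9],
     [1, 0, 9, 8],
     [8, 9, 0, 1],
     [9, 8, 1, 0]]
labels, centers = seeded_kmeans(D, k=2)
print(labels)
```

Seeding k-means from an existing partition removes the dependence on randomly chosen starting centers, which appears to be the motivation for the improvement described above.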
Keywords/Search Tags: secondary structure, distance matrix, hierarchical clustering, k-means clustering