Representation Methods and Similarity Analysis of RNA Secondary Structures

Posted on: 2014-07-03  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Wu  Full Text: PDF
GTID: 2260330425959106  Subject: Computer application technology
Abstract/Summary:
Ribonucleic acid (RNA) is one of the most important biological macromolecules: it takes part in protein synthesis, catalyzes reactions as an enzyme, carries genetic information, and plays a critical role in life processes. As research on RNA has deepened, more and more researchers have come to recognize that the structure and function of RNA are no less important to genetics than DNA, and perhaps even more so. To determine the biological function of RNA molecules, comparing RNA primary sequences and secondary structures has become an important research topic, and similarity analysis of RNA secondary structures is now one of the hot topics in the field. However, secondary and tertiary structure information for RNA is not easy to obtain. Because biological experiments are costly, they are difficult to apply widely. Many researchers therefore use bioinformatics methods, storing and analyzing RNA data by computer to predict RNA secondary structures, and have achieved good results. As a result, a variety of RNA secondary structure prediction software has emerged.
This thesis studies the comparison and similarity calculation of RNA secondary structures when the secondary structures are already known. First, we introduce the background and significance of the topic and the state of research in China and abroad. Second, we describe methods for calculating RNA secondary structure similarity and the clustering algorithms used. The similarity calculation methods discussed are based mainly on string comparison, tree topologies, graphical representation, and Lempel-Ziv (L-Z) complexity. After discussing a variety of RNA secondary structure representations, we present an algorithm for converting between dot-bracket notation and the CT file representation, so that the required CT files can be obtained conveniently.
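The conversion between dot-bracket notation and CT files mentioned above can be illustrated with a minimal Python sketch. This is not the thesis's own algorithm, which the abstract does not describe. It assumes the common six-column CT layout (index, base, previous index, next index, pairing partner with 0 for unpaired, index) and a structure without pseudoknots, using only the characters `(`, `)` and `.`. The function names are hypothetical.

```python
def dot_bracket_to_partners(structure):
    """Return a list whose entry i holds the 1-based partner of base i+1 (0 if unpaired)."""
    partners = [0] * len(structure)
    stack = []  # 0-based positions of '(' that are still unmatched
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i + 1}")
            j = stack.pop()
            partners[i] = j + 1
            partners[j] = i + 1
        elif ch != ".":
            raise ValueError(f"unsupported character {ch!r} at position {i + 1}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1] + 1}")
    return partners


def dot_bracket_to_ct(sequence, structure, title="seq"):
    """Build CT text (assumed six-column layout) from a sequence and its dot-bracket string."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    partners = dot_bracket_to_partners(structure)
    n = len(sequence)
    lines = [f"{n}\t{title}"]
    for i, (base, partner) in enumerate(zip(sequence, partners), start=1):
        next_index = i + 1 if i < n else 0  # 0 marks the end of the chain
        lines.append(f"{i}\t{base}\t{i - 1}\t{next_index}\t{partner}\t{i}")
    return "\n".join(lines)


def ct_to_dot_bracket(ct_text):
    """Recover (sequence, dot-bracket) from CT text, ignoring the header line."""
    sequence, structure = [], []
    for row in ct_text.strip().splitlines()[1:]:
        fields = row.split()
        index, base, partner = int(fields[0]), fields[1], int(fields[4])
        sequence.append(base)
        if partner == 0:
            structure.append(".")
        elif partner > index:
            structure.append("(")  # the partner lies downstream, so this base opens the pair
        else:
            structure.append(")")
    return "".join(sequence), "".join(structure)
```

For example, `dot_bracket_to_ct("GGGAAACCC", "(((...)))")` yields a nine-row CT table in which base 1 is paired with base 9, and passing that text to `ct_to_dot_bracket` returns the original sequence and structure.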
Some graph-based representations of RNA secondary structure consider only structural information and ignore the biological information that the structure carries. To overcome this shortcoming, we propose a new representation, called the semantic-structure graph, which takes full account of both the structure and the biological information carried by an RNA secondary structure. Based on the semantic-structure graph, we compute the similarity of RNA secondary structures by finding the largest common sub-path and computing the similarity between them. As experimental data, we used 10 Rfam families containing more than two thousand mixed RNAs in total, computed pairwise similarities to obtain a similarity matrix, and applied a hierarchical clustering algorithm to cluster the RNAs. The experimental results show that the proposed method is effective.
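The pipeline described above (pairwise similarity, similarity matrix, hierarchical clustering) can be sketched in Python as follows. The abstract does not define the semantic-structure graph or its similarity formula, so this sketch rests on stated assumptions: each structure is reduced to a hypothetical sequence of element labels, the "largest common sub-path" is approximated by the longest common contiguous run of labels, and the score is that run length divided by the longer sequence length. Average-linkage agglomerative clustering stands in for the unspecified hierarchical clustering algorithm. All function names are hypothetical.

```python
def longest_common_run(a, b):
    """Length of the longest contiguous run of labels shared by sequences a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1  # extend the run that ended at the previous labels
                best = max(best, cur[j])
        prev = cur
    return best


def similarity(a, b):
    """Assumed score: shared run length divided by the longer sequence length, in [0, 1]."""
    if not a or not b:
        return 0.0
    return longest_common_run(a, b) / max(len(a), len(b))


def similarity_matrix(paths):
    """Symmetric matrix of pairwise similarity scores."""
    n = len(paths)
    return [[similarity(paths[i], paths[j]) for j in range(n)] for i in range(n)]


def average_linkage(sim, k):
    """Merge the two most similar clusters (by average pairwise score) until k remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best_pair, best_score = None, -1.0
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                score = total / (len(clusters[a]) * len(clusters[b]))
                if score > best_score:
                    best_pair, best_score = (a, b), score
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy label sequences (hypothetical labels: S stem, H hairpin, M multiloop, I internal loop, B bulge).
paths = ["SSHS", "SSHSS", "MIBM", "MIBMB"]
clusters = average_linkage(similarity_matrix(paths), k=2)
```

With these toy inputs the first two sequences share a run of four labels, as do the last two, while sequences from different pairs share none, so the two pairs end up in separate clusters. A real experiment would replace the label sequences with paths extracted from the semantic-structure graph.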
Keywords/Search Tags: RNA secondary structure, transform algorithm, semantic structure graph, similarity analysis, hierarchical clustering