
Study On Motif Clustering Algorithms In Bioinformatics

Posted on: 2012-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Chen
Full Text: PDF
GTID: 2248330374489884
Subject: Computer application technology
Abstract/Summary:
Clustering algorithms are now widely used for motif finding in bioinformatics. However, some of these approaches perform well only when the binding sites differ significantly from their background sequences, whereas in practice some motifs have a nucleotide distribution relatively similar to that of the background. It is therefore worthwhile to test and compare these methods on both real and synthetic datasets.

This thesis first introduces a number of tools for clustering datasets. I then analyze and compare their performance on synthetic and real datasets using a series of testing strategies, focusing on execution speed and the accuracy of the results. Extensive preparatory work preceded the tests: all data were deliberately hand-picked, following an integrated and detailed selection procedure that I designed for this purpose. The methods and the tests are both described in relative detail. The comparison experiments used a variety of datasets, including ones whose motifs have a nucleotide distribution relatively similar to that of their background sequences. This brings the tests closer to realistic conditions, which matters because some tools miss such motifs.

The tests, which used both synthetic and real datasets for motif finding, showed that each algorithm has its own advantages in some areas. Among them, CliClustering performed outstandingly, particularly in balancing prediction sensitivity and specificity. It is more likely than the other tools to identify binding sites when their nucleotide distribution is similar to that of the background sequences. It also complements the other tools well.
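The balance between sensitivity and specificity mentioned above can be made concrete with a minimal sketch. The abstract does not say how the thesis computes these measures, so the code below assumes the common nucleotide-level definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); some motif-finding evaluations report positive predictive value instead. The function names and the half-open `(start, end)` interval format are illustrative assumptions, not taken from the thesis or from any of the tools it compares.

```python
def site_positions(sites):
    """Expand half-open (start, end) intervals into a set of nucleotide positions."""
    positions = set()
    for start, end in sites:
        positions.update(range(start, end))
    return positions


def sensitivity_specificity(true_sites, predicted_sites, seq_length):
    """Nucleotide-level sensitivity and specificity for one sequence.

    Assumed definitions (not confirmed by the source):
      sensitivity = TP / (TP + FN)
      specificity = TN / (TN + FP)
    """
    truth = site_positions(true_sites)
    predicted = site_positions(predicted_sites)

    tp = len(truth & predicted)       # site nucleotides that were predicted
    fn = len(truth - predicted)       # site nucleotides that were missed
    fp = len(predicted - truth)       # background nucleotides called as site
    tn = seq_length - tp - fn - fp    # background nucleotides left alone

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


# Hypothetical example: one true site at positions 10..19 in a 100 bp
# sequence, and a prediction shifted by 5 bp (positions 15..24).
sn, sp = sensitivity_specificity([(10, 20)], [(15, 25)], 100)
```

A tool that predicts many sites tends to raise sensitivity at the cost of specificity, which is why the two are reported together when comparing tools.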
Keywords/Search Tags: clustering methods, bioinformatics, motif