Font Size: a A A

Research On Motif Finding Algorithms Based On Maximum Cjique Refinement

Posted on:2013-03-21Degree:MasterType:Thesis
Country:ChinaCandidate:W NiuFull Text:PDF
GTID:2248330395455619Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Transcription factor binding sites are DNA sequences which bind the transcriptionfactors, they start the gene transcription and control the efficiency of gene transcription.Because transcription regulation is a key step in the regulation of gene expression, sothe prediction and recognition of transcription factor binding sites has an importantsignificance for the study of biological regulatory networks.Researching on the existing motif finding algorithms, this thesis finds thecorrelation between the implanted motif finding problem and the maximum cliqueproblem, then formalizes the implanted motif finding problem as solving the maximumclique problem of an undirected graph, and gives an algorithm to solve the implantedmotif finding problem by using the maximum clique. In order to reduce the scale of thesolution space and accelerate the convergence speed of the maximum clique problem,this thesis proposes a random projection implanted motif finding algorithm based onmaximum clique refinement. This algorithm generates some qualified buckets by usingrandom projection strategy, and can get a set of candidate motifs by exerting amaximum clique expand refinement process based on backtracking method on all thequalified buckets, and finally outputs the motif which has a highest objective functionscore as the optimal solution.This thesis analyzes and gives the random projection parameters selection’sinfluence on the random projection implanted motif finding algorithm based onmaximum clique refinement. Both consensus score and relative entropy are used tochoose the optimal solution, the optimal solution quality is evaluated by someevaluation parameters, such as, the performance coefficient and the correlationcoefficient. This thesis uses the simulated data to verify the validity of algorithm, andapplies the algorithm to predicting the transcription factor binding sites of differentspecies of biological data, particularly, this algorithm can achieve an accurate rate above80%when it predicts multiple groups of transcription factor binding sites of theSaccharomyces Cerevisiae.
Keywords/Search Tags:Implanted motif finding, Random projection, Backtracking method, Maximum clique refinement, Performance coefficient
PDF Full Text Request
Related items