Study On Sequential Motif Discovery Algorithms Based On Subgraph Density

Posted on: 2010-05-08; Degree: Master; Type: Thesis
Country: China; Candidate: Q M Hou; Full Text: PDF
GTID: 2178330332487671; Subject: Computer application technology
Abstract/Summary:
With the launch of the Human Genome Project and the rapid development of modern biotechnology, biological data has grown explosively, providing a data foundation for unraveling the mysteries of life. Growth in computing power and the development of the Internet have in turn provided a basis for storing, processing, retrieving, and interpreting data at large scale. The goal of bioinformatics is to apply methods from information science and computing to analyze and process these data, revealing the relationships within massive data sets and their biological meaning, explaining the structural and functional information they contain, and extracting useful biological knowledge. A motif is one expression of the code of life. Motif discovery is a basic method for revealing the biological significance of sequence data and an important research problem in bioinformatics. Motif discovery is an NP-complete problem. A number of effective algorithms have been proposed, but each has limitations. As data sets keep growing and new problems emerge, many existing algorithms no longer meet practical needs. The study of more effective algorithms has therefore become a major issue in motif discovery for biological sequences and is receiving increasing attention.

In this thesis, we first analyze the models used in motif discovery algorithms and study the algorithms built on each model. Based on this analysis, we propose an exhaustive search algorithm based on the Maximal Density Subgraph (MDS). A graph is constructed from the input sequences, in which the vertices represent all l-mers (substrings of length l) in the sequences and the edges represent pairwise l-mer similarity. The motif discovery problem is thereby transformed into the problem of finding the MDS in the graph. The algorithm uses the position weight matrix (PWM) model to find the motif. Theoretical analysis and experimental results on synthetic and real data show that the algorithm detects motifs efficiently and is a good solution to the (15,4) challenge problem.
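
The graph formulation described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not give the similarity rule, the density definition, or the search procedure, so everything below is an assumption made for illustration. Edges join l-mers from different sequences whose Hamming distance is at most 2d, since two instances that each differ from a common motif in at most d positions differ from each other in at most 2d positions. Density is taken as edges divided by vertices. In place of the thesis's exhaustive search, the sketch uses greedy peeling, a well-known approximation heuristic for dense subgraphs. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def build_lmer_graph(sequences, l, d):
    """Vertices are (sequence index, offset) pairs, one per l-mer.
    Assumed edge rule: l-mers from different sequences are joined
    when their Hamming distance is at most 2*d."""
    vertices = [(i, j) for i, s in enumerate(sequences)
                for j in range(len(s) - l + 1)]
    lmer = {v: sequences[v[0]][v[1]:v[1] + l] for v in vertices}
    adj = {v: set() for v in vertices}
    for u, v in combinations(vertices, 2):
        if u[0] != v[0] and hamming(lmer[u], lmer[v]) <= 2 * d:
            adj[u].add(v)
            adj[v].add(u)
    return lmer, adj


def density(nodes, adj):
    """Assumed density measure: induced edges divided by vertices."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    edges = sum(len(adj[v] & nodes) for v in nodes) // 2
    return edges / len(nodes)


def greedy_dense_subgraph(adj):
    """Greedy peeling heuristic (NOT exhaustive search): repeatedly
    remove a minimum-degree vertex and keep the densest set seen."""
    alive = set(adj)
    deg = {v: len(adj[v]) for v in adj}
    edges = sum(deg.values()) // 2
    best = set(alive)
    best_density = edges / len(alive) if alive else 0.0
    while len(alive) > 1:
        v = min(alive, key=lambda x: (deg[x], x))
        alive.remove(v)
        edges -= deg[v]  # deg[v] counts only neighbours still alive
        for w in adj[v]:
            if w in alive:
                deg[w] -= 1
        current = edges / len(alive)
        if current > best_density:
            best, best_density = set(alive), current
    return best, best_density


def build_pwm(lmers, alphabet="ACGT", pseudocount=1.0):
    """Position weight matrix as per-position letter frequencies,
    smoothed with a pseudocount (an illustrative choice)."""
    total = len(lmers) + pseudocount * len(alphabet)
    matrix = []
    for pos in range(len(lmers[0])):
        counts = Counter(s[pos] for s in lmers)
        matrix.append({c: (counts[c] + pseudocount) / total
                       for c in alphabet})
    return matrix


def consensus(matrix):
    """Most probable letter at each PWM position."""
    return "".join(max(col, key=col.get) for col in matrix)


# Toy input with the 4-mer ACGT planted exactly (d = 0) in each sequence.
sequences = ["TTACGTGG", "CCACGTAA", "GGACGTCC"]
lmer, adj = build_lmer_graph(sequences, l=4, d=0)
best, best_density = greedy_dense_subgraph(adj)
print(sorted(best), best_density)
print(consensus(build_pwm([lmer[v] for v in sorted(best)])))
```

For the (15,4) challenge problem, l would be 15 and d would be 4. The quadratic edge construction above is only practical for small inputs and is meant to show the data flow from sequences to graph to dense subgraph to PWM, not to reproduce the efficiency the thesis reports.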
Keywords/Search Tags: motif discovery, sequences, graph, maximal density subgraph