Research On Co-regulated Gene Mining Algorithms

Posted on: 2010-10-23
Degree: Master
Type: Thesis
Country: China
Candidate: T Bai
Full Text: PDF
GTID: 2178360272997573
Subject: Computer application technology
Abstract/Summary:
Regulation is an important mechanism in the activities of life. Regulatory relationships within organisms are complex, however; even a simple model organism has tens of thousands of regulatory relationships among its biological molecules. Studying gene regulatory mechanisms not only gives a deeper understanding and better prediction of gene function, but also supports a systematic description of the processes of life. Most current studies look for co-expressed genes, which are quite different from co-regulated genes. Co-regulated genes are a set of genes known to be regulated by at least one common transcription factor, so methods that search for co-regulated genes directly are needed. In complex network theory, the clustering (aggregation) coefficient measures how closely the neighbors of a node are connected to one another. Cellular networks such as protein-protein interaction networks, gene regulatory networks, and metabolic networks have high average clustering coefficients, indicating that high clustering is an essential characteristic of biological networks.
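As a small illustration of the clustering coefficient mentioned above, the following sketch computes the local and average coefficients of an undirected graph stored as an adjacency dictionary. The toy graph and the function names are assumptions made for illustration; they are not taken from the thesis.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pairs to check
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy graph: a triangle a-b-c with one extra node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

for name in sorted(toy):
    print(name, local_clustering(toy, name))
print("average", average_clustering(toy))
```

Nodes inside the triangle score high, while the attached node d scores zero, which is the sense in which a high average coefficient hints at tightly knit modules.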
A high clustering coefficient is a sign of potential modularity in a network, and modules can generally be observed in real biological systems. A module is a group of physically or functionally connected nodes that operate together to achieve a relatively independent function. Some work has already been done on module partitioning and functional analysis. By definition, modules in genetic regulatory networks fall into two categories: modules based on DNA microarray expression data, and functional modules based on the topology of the transcriptional regulatory network. This thesis studies both categories, covering clustering algorithms for co-regulated genes and methods for mining functional modules in regulatory networks. First, it defines a sliding-window matching-based similarity measure, SMSM, which takes into account all four kinds of co-regulation relationship between genes. Then, an improved global K-means clustering algorithm, AGKM, is used to cluster co-regulated genes from gene expression data, and comparison with a traditional algorithm gives good results. Finally, the clique percolation method (CPM), an algorithm for mining overlapping community structure, is applied to mine functional modules in a gene regulatory network; queries against a gene annotation database show that most genes in a module are involved in the same biological process.

The specific contents of the thesis are as follows:

1. There are four kinds of co-regulation relationship between genes: positive and negative co-regulation, co-regulation with differing expression levels, time-delayed co-regulation, and co-regulation over only part of the time course. Existing similarity measures for co-regulated genes do not appear to account for all four relationships at once. This thesis defines a sliding-window matching-based similarity measure (Sliding Match based Similarity Measure, SMSM). By constructing a sliding-window model that matches positive and negative changes, it measures the similarity between two genes while considering all four co-regulation relationships. This similarity coefficient is the basis for the following two parts of the work, where it serves as the standard for measuring the degree of similarity (distance) between genes.

2. K-means, a traditional iterative clustering method, has several well-recognized shortcomings: the number of clusters must be specified in advance, the algorithm often ends in a local optimum, it performs poorly on data with isolated points or clusters of very different sizes, and it is sensitive to the choice of initial centers. This thesis uses an adaptive global K-means algorithm (Adaptive Global K-Means, AGKM) to address these problems. The basic idea of AGKM is as follows. First, a sufficiently large number of individuals (K of them) are selected at random as initial centers and the traditional K-means algorithm is run once, giving a locally optimal solution with K clusters and K cluster centers. Then one of the K centers is eliminated, the remaining K-1 centers are used as initial centers, and a "(K-1)-means" run is performed. The process repeats, eliminating one center per iteration and running "(K-n)-means" at the n-th iteration, until an adaptive termination condition is met and the algorithm stops with K-n clusters. The core of each iteration is choosing which center to eliminate. Because the random initialization uses a large number of centers, a "small cluster" containing few samples has a greater probability of containing an initial center, while a "large cluster" may contain more than one initial center. After a K-means run, such a "large cluster" is divided into several "partial clusters."
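The elimination loop described above can be sketched roughly as follows. This is not the thesis implementation. It assumes that the center eliminated at each step is the one whose cluster currently holds the fewest points, and it uses a caller-supplied `k_min` as a stand-in for the adaptive termination condition, which the abstract does not specify. Squared Euclidean distance replaces SMSM, and all function names are hypothetical.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centers, iters=100):
    """Plain K-means iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(iters):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(x) / len(m) for x in zip(*m)) if m else c
            for c, m in zip(centers, clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def agkm_sketch(points, k_start, k_min, seed=0):
    """Start with many random centers, then eliminate one center per round."""
    rng = random.Random(seed)
    centers, clusters = kmeans(points, rng.sample(points, k_start))
    while len(centers) > k_min:  # assumption: fixed stop instead of adaptive rule
        # Assumption: eliminate the center of the smallest current cluster.
        drop = min(range(len(centers)), key=lambda i: len(clusters[i]))
        centers = centers[:drop] + centers[drop + 1:]
        centers, clusters = kmeans(points, centers)
    return centers, clusters


# Hypothetical data: two well-separated groups of four points each.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1)]
final_centers, final_clusters = agkm_sketch(pts, k_start=5, k_min=2)
print(final_centers)
```

Starting from more centers than needed lets each run inherit the previous solution, which is the property the description relies on to reduce sensitivity to initialization.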
In each iteration, the algorithm selects the center of a relatively small "partial cluster" for elimination. Clustering experiments on real yeast co-regulated gene data sets, compared against the results obtained with CLUSTER3.0, indicate that the algorithm is feasible and effective.

3. Transcriptional regulation data for Saccharomyces cerevisiae were compiled into a regulatory network of 3562 genes and 7074 regulatory relationships among them. The clique percolation method (Clique Percolation Method, CPM), a community mining algorithm, is combined with the SMSM similarity from the first part to identify functional modules in the gene regulatory network. The algorithm divides the network into nine functional modules. Looking up the biological significance of the known genes in each module in the gene annotations of the SGD database shows that the genes in a module take part in the same or similar biological processes. This suggests that overlapping community structure mining algorithms can identify functional modules of gene regulatory networks well.

Analyzing and discovering co-regulation relationships among genes has considerable theoretical and practical significance. This thesis studies the clustering of co-regulated genes and methods for mining functional modules in gene regulatory networks. These are important ways of finding functionally similar or related genes within the complex regulatory structure of organisms, and the work offers good prospects for follow-up research and practical application.
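The clique percolation step in part 3 can be illustrated with a minimal sketch of the standard formulation: maximal cliques of at least k nodes are merged into one community whenever they share at least k-1 nodes, so a node may belong to several communities. The sketch treats the network as undirected and leaves out the SMSM weighting used in the thesis; the toy edge list and all names are assumptions for illustration only.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency dictionary from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch enumeration (no pivoting) of all maximal cliques."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def k_clique_communities(adj, k):
    """Merge maximal cliques of size >= k that overlap in >= k-1 nodes."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())


# Hypothetical toy network: node "d" sits in two overlapping modules.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
         ("c", "d"), ("c", "g"), ("d", "g"),
         ("d", "e"), ("d", "f"), ("e", "f")]
communities = k_clique_communities(build_adjacency(edges), k=3)
print(sorted(sorted(c) for c in communities))
```

Because communities are unions of overlapping cliques rather than a partition, a gene regulated within two different modules can be reported in both, which is the reason an overlapping method suits regulatory networks.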
Keywords/Search Tags:Bioinformatics, Regulatory Network, Clustering, Community Mining