The Construction Of Gene Regulatory Network Based On The Rough Cluster

Posted on: 2009-10-23 | Degree: Master | Type: Thesis
Country: China | Candidate: W Z Wang
GTID: 2178360242980851 | Subject: Computer application technology
Abstract/Summary:
With the completion of the Human Genome Project, the study of gene function has become a central research interest of the post-genome era. Microarray techniques make it possible for scientists to discover regulatory relationships between genes. A gene regulatory network is, by definition, a simulation or reconstruction of the mutual relations among expressed genes. By providing a visual model of gene expression, gene regulatory networks help people understand which genes are expressed in an organism, and where, when and how. For this reason, they have been widely applied in research on the relations between genes and diseases and in drug target design.

Reconstructing a gene regulatory network means building a model of genetic interactions from massive gene expression data, combined with analytical and computational methods that simulate the dynamic behavior of the system, so as to reveal the interdependent relationships between genes. In turn, the established model can guide further biological experiments. Lying at the intersection of molecular biology, nonlinear mathematics and information science, the study of gene regulatory networks has become an important field in the post-genome era. Current approaches use existing models to analyze gene expression data and the relationships between genes. The research is still at a developing stage, in which models are continuously improved and corresponding algorithms keep emerging.

Based on a careful analysis and summary of related work, and combining biological and computing knowledge, this paper briefly introduces the biological background and behavior related to gene regulatory networks.
This paper reviews the main mathematical methods and models for gene regulatory networks, including directed graphs, Boolean networks, linear combination models, weight matrices models, mutual information networks, correlation coefficient models, Bayesian networks and differential equations. Each method has its own advantages and limitations, and we have not found any model that is perfect for genetic regulatory systems. Because gene expression is continuous, continuous network models are generally used to analyze and reconstruct gene regulatory networks. The weight matrices model is relatively widely applied in this research. It not only answers whether there are interactions between genes, but also describes the strength of each interaction in the form of a weight.

Gene expression profiles are commonly analyzed by clustering, whose purpose is to group genes. To describe reality better, this paper clusters genes based on rough sets. Each cluster has a lower and an upper approximation. Data in the lower approximation belong exclusively to the cluster, while data in the upper approximation may belong to the cluster. On this basis, this paper introduces a rough-based k-means algorithm and describes its relevant definitions and procedure. Experiments on data from the UCI machine learning databases validate that the rough clustering algorithm clusters well.

This paper proposes an improved immune algorithm based on information entropy. The immune algorithm is a stochastic heuristic method that uses a simple mutation operation. To improve its performance, the non-uniform mutation operator of genetic algorithms is introduced into the immune algorithm. The algorithm with non-uniform mutation is compared experimentally with the original algorithm.
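As an illustration of the operator mentioned above, the following is a minimal sketch of non-uniform mutation in its commonly cited genetic-algorithm form, where the mutation step shrinks as the generation counter approaches its maximum. The abstract does not give the exact formula used in the thesis, so the function names, the shape parameter `b` and the 50/50 choice of direction are assumptions made for illustration only.

```python
import random


def delta(t, y, max_gen, b, rng):
    """Step size in [0, y] that tends to 0 as t approaches max_gen."""
    r = rng.random()
    return y * (1.0 - r ** ((1.0 - t / max_gen) ** b))


def non_uniform_mutate(x, lower, upper, t, max_gen, b=2.0, rng=random):
    """Mutate one real-valued gene x inside [lower, upper] at generation t.

    Assumed form: early generations explore widely, late generations
    make only small local changes.
    """
    if rng.random() < 0.5:
        x_new = x + delta(t, upper - x, max_gen, b, rng)
    else:
        x_new = x - delta(t, x - lower, max_gen, b, rng)
    # Clamp to guard against floating-point rounding at the bounds.
    return min(max(x_new, lower), upper)
```

With `t` equal to `max_gen` the step size is zero and the gene is returned unchanged, which is the property that distinguishes this operator from a simple fixed-range mutation.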
The non-uniform mutation operator improves the performance of the immune algorithm and yields relatively satisfactory experimental results.

Finally, this paper proposes a novel algorithm for constructing gene regulatory networks that combines the rough-based k-means algorithm, the improved immune algorithm and the weight matrices model. The rough clustering algorithm identifies each specific functional module and the genes that coordinate those modules. The weight matrices model and the improved immune algorithm then learn the gene regulations within each module. Because the modules include the genes that play a coordinating role, the whole gene regulatory network can be obtained from the local networks. This method not only retains the advantage of the weight matrices model in analyzing gene chip data, but also reduces the relations that violate the local and hierarchical characteristics of the network. Because gene data sets are generally very large, clustering the genes first may reduce the time complexity of studying the gene regulatory network. An experiment on yeast cell cycle data demonstrates the effectiveness of this method.

This paper covers only basic research and experiments. The improved algorithm has its own deficiencies, for example how to choose the values of the three parameters (threshold, elow, eup) of the rough-based k-means algorithm. This paper compares different values of the three parameters and obtains a reasonable result. It is also worth discussing whether other rough-based clustering algorithms can be established. The improved immune algorithm is a stochastic heuristic method. Although this kind of algorithm has a strong capability for global optimization, its specific form has a very significant influence on its performance. The rough-based k-means algorithm, the improved immune algorithm and the weight matrices model are used together to construct the gene regulatory network of yeast.
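To show where three such parameters could enter a rough-based k-means algorithm, here is a minimal sketch in the style of the widely used rough k-means formulation. It is not taken from the thesis. In particular, it assumes that `threshold` bounds the distance gap that sends a point into several upper approximations, and that `e_low` and `e_up` (standing in for elow and eup) weight the lower approximation and the boundary region when a centroid is updated.

```python
import math


def rough_assign(points, centroids, threshold):
    """Return (lower, upper): per-cluster lower and upper approximations."""
    k = len(centroids)
    lower = [[] for _ in range(k)]
    upper = [[] for _ in range(k)]
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        nearest = min(range(k), key=lambda j: dists[j])
        # Other clusters that are almost as close as the nearest one.
        close = [j for j in range(k)
                 if j != nearest and dists[j] - dists[nearest] <= threshold]
        upper[nearest].append(p)
        for j in close:
            upper[j].append(p)
        if not close:
            # Unambiguous point: it belongs exclusively to one cluster.
            lower[nearest].append(p)
    return lower, upper


def mean(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def rough_centroid(lower, upper, e_low, e_up, fallback):
    """Weighted mix of the lower-approximation mean and the boundary mean."""
    boundary = [p for p in upper if p not in lower]
    if lower and boundary:
        m_low, m_bnd = mean(lower), mean(boundary)
        return tuple(e_low * a + e_up * b for a, b in zip(m_low, m_bnd))
    if lower:
        return mean(lower)
    if boundary:
        return mean(boundary)
    return fallback  # empty cluster: keep the previous centroid


def rough_kmeans(points, centroids, threshold, e_low, e_up, iters=20):
    """Alternate rough assignment and centroid update for a fixed number of rounds."""
    for _ in range(iters):
        lower, upper = rough_assign(points, centroids, threshold)
        centroids = [rough_centroid(lower[j], upper[j], e_low, e_up, centroids[j])
                     for j in range(len(centroids))]
    lower, upper = rough_assign(points, centroids, threshold)
    return centroids, lower, upper
```

Under these assumptions, a point that lies about equally far from two centroids ends up in both upper approximations and in no lower approximation, which matches the description of data that "may belong" to a cluster.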
The yeast cell gene data set is very large. Although the time complexity is much lower than that of other methods, it remains relatively high because of the large number of parameters and the constraints of computing capacity.

Meanwhile, the gene expression data used to infer gene regulatory networks cannot supply complete information. The expression data contain only partial information about the network, so the correct result cannot be inferred from gene expression data alone. A new research direction is to infer gene regulatory networks from multiple data sources and biological information, so that the inferred networks conform to real biological regulatory networks.

Because research on gene regulatory networks is still at an exploratory stage, many aspects, such as the design of algorithms and models, need further discussion and deserve more effort.
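The remark above about the large number of parameters can be made concrete with a small sketch of a weight matrices model. For n genes the model has n × n weights, so restricting it to one module at a time shrinks the search space considerably. The sigmoid squashing function, the bias terms and the squared-error measure below are assumptions chosen for illustration. The abstract only states that interaction strengths are expressed as weights.

```python
import math


def step(weights, bias, x):
    """Predict the next expression levels from the current levels x.

    weights[i][j] is the assumed influence of gene j on gene i:
    positive for activation, negative for repression, zero for none.
    """
    return [
        1.0 / (1.0 + math.exp(-(sum(w * xj for w, xj in zip(row, x)) + b)))
        for row, b in zip(weights, bias)
    ]


def fit_error(weights, bias, series):
    """Sum of squared one-step prediction errors over a time series."""
    err = 0.0
    for x_now, x_next in zip(series, series[1:]):
        pred = step(weights, bias, x_now)
        err += sum((p - o) ** 2 for p, o in zip(pred, x_next))
    return err
```

A search procedure such as the improved immune algorithm could, under these assumptions, treat `fit_error` as the objective to minimize over candidate weight matrices for each module.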
Keywords/Search Tags: Construction