Bi-correlation Pattern Discovery Of Gene Expression Based On Bi-clustering Method

Posted on: 2011-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: J D Shen
GTID: 2178330338981782
Subject: Computer application technology
Abstract/Summary:
Analysis of gene expression data is very important in biology and bioinformatics, particularly for research on gene function. Traditional clustering methods use a similarity measure to determine how similar genes are and to group them. However, in many cases genes show a consistent pattern, such as a coordinated increase or decrease in expression, only under a subset of the experimental conditions. Genes and experimental conditions therefore need to be clustered at the same time, which leads to bidirectional clustering (biclustering).

The δ-Bicluster algorithm is a classical biclustering algorithm based on the mean squared residue (MSR) and an iterative greedy search strategy. After analyzing the advantages and disadvantages of that algorithm, we propose a new biclustering algorithm called ProBicluster. The algorithm focuses on two aspects: the biclustering pattern problem, which is solved with bi-graph theory to find linear patterns, and the cross-clustering problem, which is solved by re-biclustering. The algorithm is tested on artificial data sets to illustrate the effectiveness of ProBicluster. A data analysis platform based on ProBicluster was also implemented.

Bi-correlation analysis of genes and conditions is carried out on the yeast cell cycle data set. ProBicluster and four other clustering algorithms are applied to the data set, and the biclustering results are compared and analyzed using genetic evaluation. The scores show that ProBicluster's correlation coefficient score is higher than those of several other algorithms, and that its coefficient of restitution score is on par with that of the OPSM algorithm. As a whole, the ProBicluster algorithm has good accuracy and effectiveness.

Bi-correlation analysis of genes and drugs is carried out on the NCI60 data set. The biclustering results are analyzed from the perspectives of both genes and drugs: the gene analysis uses pathways, and the drug analysis uses drug class and physical and chemical properties. After verification against the literature, we found that in one sub-model the drug Cisplatin and the gene CCND1 showed a resistance relationship, indicating that the biclustering algorithm is useful for finding genes associated with drug effectiveness, which can further help drug discovery and drug design.
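
The MSR score that the δ-Bicluster algorithm mentioned above relies on can be illustrated with a short sketch. The code below is not taken from the thesis and is not ProBicluster; it only computes the standard mean squared residue of a candidate bicluster, where each entry's residue is the entry minus its row mean, minus its column mean, plus the overall mean of the submatrix. The function name `mean_squared_residue` and the toy matrix `expr` are hypothetical, chosen for illustration.

```python
from typing import Sequence


def mean_squared_residue(
    data: Sequence[Sequence[float]],
    rows: Sequence[int],
    cols: Sequence[int],
) -> float:
    """Mean squared residue of the submatrix of `data` picked by `rows` and `cols`.

    A score of 0 means the submatrix follows a perfectly additive pattern
    (every row equals another row plus a constant shift).
    """
    if len(rows) == 0 or len(cols) == 0:
        raise ValueError("a bicluster needs at least one row and one column")

    # Extract the candidate bicluster (genes = rows, conditions = columns).
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy expression matrix: the top-left 3x3 block is additive
# (each row is a shifted copy of the first), the rest is noise.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [5.0, 0.0, 8.0, 2.0],
]

coherent = mean_squared_residue(expr, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(expr, rows=range(4), cols=range(4))
print(f"MSR of coherent block: {coherent:.6f}")
print(f"MSR of whole matrix:   {whole:.6f}")
```

A greedy search in the δ-Bicluster style would repeatedly remove or add rows and columns so that this score stays below a threshold δ; that search loop is deliberately left out of the sketch.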
Keywords/Search Tags: Bi-cluster, Bi-correlation, Gene Expression, Drug Activity