Evolutionary Computation Based Maximum Similarity Biclustering And Application

Posted on:2014-03-05

Degree:Master

Type:Thesis

Country:China

Candidate:X J Peng

Full Text:PDF

GTID:2268330425483703

Subject:Computer Science and Technology

Abstract/Summary:

Gene expression data produced by gene chip experiments are large in scale, typically containing thousands of genes and hundreds of samples, so they are both high-dimensional and large in volume. At the same time, because individual organisms are complex, gene expression levels may differ greatly or be highly similar, and these patterns are scattered through the data without order. Valuable information is hidden in these data, and mining is needed to uncover it. Biclustering is a good analysis tool for gene expression data: compared with traditional clustering, it can uncover more similarity patterns and more biologically meaningful information. This thesis therefore studies biclustering of gene expression data. The main work is as follows.

First, the thesis surveys the types and structures of biclusters and the search strategies of biclustering algorithms, analyzes the characteristics of mainstream biclustering algorithms, explores the evolutionary-computation-based biclustering algorithm model, and puts forward some proposals for improvement.

Second, and as its main contribution, the thesis proposes an evolutionary-computation-based maximum similarity biclustering algorithm for gene expression data. The algorithm first uses a feature selection algorithm to select some columns of the gene expression data as reference conditions, then converts the data matrix based on these reference conditions, and next derives the similarity matrix according to the reference genes. Finally, it runs an evolutionary algorithm, initializing the population according to binary encoding rules and iterating until the evolution finishes, to obtain the best individual. Best individuals that meet certain conditions are decoded into biclusters and saved in the results. The final output of the algorithm is a set of biclusters.

Finally, comparative experiments on several expression data sets test the performance of the algorithm. The first kind of data is synthetic data sets, the second is two yeast gene expression data sets, and the third is cancer gene expression data. The biclusters obtained from these data are scored according to certain rules, and the results are compared. The comparison shows that the proposed algorithm outperforms some other algorithms. In addition, the results on the third data set show that the algorithm performs well on cancer classification.
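
The abstract does not give formulas or parameters, so the following is only a minimal Python sketch of the general shape of such a pipeline, not the thesis's actual method. It skips the feature-selection and data-conversion steps, which the abstract does not detail. The similarity formula, the fitness function, the parameter values, and all function names (similarity_matrix, decode, fitness, evolve) are assumptions made for illustration.

```python
import numpy as np

def similarity_matrix(data, ref_row, threshold):
    """Similarity of each cell to the reference gene's value in the same column.

    Assumed form (not from the thesis): 1 - |a_ij - a_rj| / threshold, clipped at 0.
    """
    dist = np.abs(data - data[ref_row])
    return np.clip(1.0 - dist / threshold, 0.0, None)

def decode(individual, n_rows):
    """Binary encoding: the first n_rows bits select genes, the rest select conditions."""
    rows = np.flatnonzero(individual[:n_rows])
    cols = np.flatnonzero(individual[n_rows:])
    return rows, cols

def fitness(individual, sim, gamma=0.5):
    """Assumed score: total similarity above gamma over the selected submatrix."""
    rows, cols = decode(individual, sim.shape[0])
    if rows.size == 0 or cols.size == 0:
        return 0.0
    return float((sim[np.ix_(rows, cols)] - gamma).sum())

def evolve(sim, pop_size=40, generations=100, mutation_rate=0.02, seed=0):
    """Simple generational evolutionary loop with elitism (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    length = sim.shape[0] + sim.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, length))  # random binary population
    for _ in range(generations):
        scores = np.array([fitness(ind, sim) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]           # keep the better half
        children = [pop[order[0]].copy()]               # elitism: carry over the best
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.integers(0, 2, size=length).astype(bool)
            child = np.where(mask, a, b)                # uniform crossover
            flip = rng.random(length) < mutation_rate
            children.append(np.where(flip, 1 - child, child))  # bit-flip mutation
        pop = np.array(children)
    scores = np.array([fitness(ind, sim) for ind in pop])
    return pop[int(np.argmax(scores))]

# Synthetic example: a constant bicluster planted in rows 0-7, columns 0-4.
data_rng = np.random.default_rng(1)
data = data_rng.normal(0.0, 5.0, size=(20, 10))
data[:8, :5] = 3.0
sim = similarity_matrix(data, ref_row=0, threshold=1.0)
best = evolve(sim)
rows, cols = decode(best, n_rows=data.shape[0])
print("selected genes:", rows)
print("selected conditions:", cols)
```

In this sketch the reference gene is fixed by hand (row 0). A fuller version would repeat the search for several reference genes and keep only the decoded biclusters that pass a quality threshold, which is roughly what the abstract means by saving the best individuals that meet certain conditions.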

Keywords/Search Tags:

Gene expression data, Evolutionary computation, Biclustering, Maximum similarity bicluster, Similarity matrix

Related items

1	Research On Biclustering Algorithms For Gene Expression Data
2	The Research Of Genetic Algorithms For Biclustering On Gene Expression Data
3	Research On Multi-Objective Evolutionary Computation For Biclustering In Microarray Gene Expression Data
4	Analysis Of Gene Expression Data Clustering Algorithm
5	The Research On Biclustering Algorithm Applied To Gene Expression Data
6	Studies On The Biclustering Algorithms For Gene Microarray Data
7	Research On Biclustering Methods For Gene Expression Data Analysis
8	The Design And Implementation Of Bicluster Data Analyzing Software
9	Application Of Improved Biclustering Method To Cancer Gene Expression Data
10	Biclustering Analysis For Gene Expression Data