
Research On Clustering Algorithm Based On Binary Graph

Posted on: 2016-11-07 | Degree: Master | Type: Thesis
Country: China | Candidate: L Hou | Full Text: PDF
GTID: 2208330470950653 | Subject: Computer software and theory
Abstract/Summary:
With the progress of computational biology in recent years, gene chip technology and gene expression chips have emerged. They make it possible to measure the expression of many genes at once and to produce gene expression data on a large scale. Many useful conclusions can be drawn from these large data sets, and data analysis techniques play a crucial role in doing so.

Cluster analysis is an important way to study such data. Traditional clustering methods work along a single direction, either rows or columns, but some elements of the matrix are affected jointly by rows and columns. Clustering along both directions at once has therefore been proposed; this is called biclustering. Its main function is to divide the data into groups so that similarity within a group is relatively high and similarity between different groups is relatively low. Gene expression data are divided into clusters of highly similar genes, which may share a common expression pattern for a disease or trait, and this helps in exploring how genes work. Clustering of gene expression data is now very common in research. However, the existing algorithms have shortcomings and some problems are not easy to solve, so research on clustering algorithms is of real significance.

This thesis describes a biclustering algorithm based on divide and conquer: BIMAX. The algorithm has been shown to find, within an acceptable time, all biclusters that meet the required size, and it provides a relatively good baseline algorithm. The method uses a 0-1 matrix model: it first groups the columns of the matrix, then moves rows to divide the matrix, processes the resulting submatrices that overlap, and finally obtains the biclusters.
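
As an illustration of the column grouping and row moving described above, the following is a minimal Python sketch of a single division step on a 0-1 matrix. It is a simplified reading of the published Bimax procedure, not the code used in the thesis: the function name `bimax_split` and the returned dictionary layout are assumptions made for this example, and the recursion and the extra bookkeeping needed to avoid reporting duplicate biclusters are left out.

```python
def bimax_split(matrix, rows, cols):
    """One Bimax-style division step on a 0-1 matrix (list of lists).

    Returns None when no row mixes 0s and 1s inside `cols`, which means
    the submatrix cannot be divided any further by this step.
    """
    # Pick a template row that has both 0s and 1s in the current columns.
    for r in rows:
        ones = [c for c in cols if matrix[r][c] == 1]
        if 0 < len(ones) < len(cols):
            break
    else:
        return None

    # Group the columns: CU holds the template row's 1s, CV its 0s.
    cu = ones
    cv = [c for c in cols if matrix[r][c] == 0]

    # Move each row into one of three groups according to where its 1s lie.
    gu, gw, gv = [], [], []
    for i in rows:
        in_u = any(matrix[i][c] == 1 for c in cu)
        in_v = any(matrix[i][c] == 1 for c in cv)
        if in_u and in_v:
            gw.append(i)   # 1s on both sides: the overlapping rows
        elif in_u:
            gu.append(i)   # 1s only in CU
        elif in_v:
            gv.append(i)   # 1s only in CV
        # rows without any 1 in `cols` are dropped

    # The two submatrices to process next; they share the rows in GW.
    return {"U": (gu + gw, cu), "V": (gw + gv, list(cols))}


demo = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
split = bimax_split(demo, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
```

In this sketch row 2 has 1s on both sides of the column split, so it appears in both submatrices; this is the overlap that the divide-and-conquer step has to handle.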
However, the algorithm has some drawbacks. Building on the fact that BIMAX begins by dividing the columns, this thesis uses K-means clustering, the residue from the CC algorithm, and a clustering decision criterion, the Gain Value, to preprocess the original matrix and adjust its columns before BIMAX runs. The column set that BIMAX divides first then has a high degree of cohesion, so biclusters can be found more quickly, which improves the speed of BIMAX when processing a matrix.

In the experiments, the improved BIMAX algorithm was compared with the original BIMAX algorithm: on the same matrix, the time needed to output all biclusters of the required size was reduced.
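
The residue mentioned above is commonly understood as the mean squared residue score of the Cheng and Church (CC) algorithm, which measures how far a submatrix is from a purely additive row-plus-column pattern. The sketch below computes that score in plain Python. How the thesis combines it with K-means and the Gain Value is not stated in the abstract, so this only illustrates the score itself; the function name `mean_squared_residue` is an assumption made for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix (rows, cols).

    For each element, the residue is
        a_ij - (row mean) - (column mean) + (submatrix mean),
    and the score is the mean of the squared residues.
    A score of 0 means the submatrix is perfectly additive.
    """
    n_rows, n_cols = len(rows), len(cols)
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    all_mean = sum(row_mean.values()) / n_rows

    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Every row is the first row plus a constant, so the score is about 0.
additive = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
score_additive = mean_squared_residue(additive, [0, 1, 2], [0, 1, 2])

# A checkerboard pattern is not additive, so the score is positive.
checker = [[1, 0], [0, 1]]
score_checker = mean_squared_residue(checker, [0, 1], [0, 1])
```

A low score for a group of columns suggests that the rows behave coherently across them, which is one plausible way to judge whether a candidate column set is cohesive before handing the matrix to a biclustering step.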
Keywords/Search Tags:biclustering, BIMAX algorithm, gene expression