A Study Of Biclustering Based On Intelligence Optimization

Posted on: 2015-01-10  Degree: Master  Type: Thesis
Country: China  Candidate: X L Tian  Full Text: PDF
GTID: 2308330464968721  Subject: Circuits and Systems
Abstract/Summary:
Clustering is an important tool in data mining. It partitions a dataset into clusters according to a given criterion in order to reveal the structure of the data distribution and to gain a deeper understanding of the data. A large number of clustering techniques have been proposed to date. With the rapid development of microarray technology, the volume of gene expression data keeps growing. For high-dimensional data, the similarity measures used in traditional clustering algorithms are no longer meaningful, so high-dimensional data poses new challenges for clustering algorithms. Biclustering is a very useful tool for mining gene expression data. It aims to find subgroups of genes that show highly correlated behavior across a subgroup of conditions. The main work of this thesis is summarized as follows:

1. A survey of traditional clustering and biclustering techniques is given. We first discuss the classification of traditional clustering algorithms and introduce several of them. We then analyze the shortcomings of traditional clustering algorithms in mining high-dimensional data and introduce biclustering into high-dimensional data analysis. We present the relevant concepts of biclustering and some classic biclustering algorithms in detail, and we discuss how the search strategies used in biclustering algorithms can be classified. Over the past ten years, swarm intelligence optimization has been widely used in gene expression data analysis, and we discuss why it is applicable to this task.

2. A new algorithm, called the pattern-driven binary Particle Swarm Optimization (PSO) algorithm, is proposed by studying the application of PSO to gene expression data. PSO is a population-based intelligent optimization algorithm with the advantage of fast convergence. It is easy to understand, has few parameters to control, and is widely used in scientific research. PSO is an algorithm based on evolutionary search. Algorithms of this kind suffer from low efficiency, so local search operators need to be considered. The pattern-driven operator is a local search operator that uses potential trajectory information hidden in the gene expression data. With the pattern-driven operator, the search space is greatly reduced and the efficiency of PSO improves.

3. An improved multi-objective optimization algorithm for biclustering gene expression data is proposed. The algorithm uses the fast non-dominated sorting genetic algorithm (NSGA-II) as its framework, with the Cheng & Church algorithm and the pattern-driven search operator as local search operators. The analysis of gene expression data involves several conflicting objectives, which makes it well suited to multi-objective optimization. NSGA-II is widely used and easy to understand. The experimental results show that the proposed algorithm obtains better results than the original algorithm.
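
To make the biclustering objective more concrete, the following is a minimal Python sketch of the mean squared residue score introduced by Cheng & Church, a standard measure of how coherent a bicluster is (lower is better, and 0 means a perfectly additive pattern). The abstract does not state which fitness function the thesis uses, so treating this score as the objective is an assumption. The bit-string encoding in decode_particle (one bit per gene followed by one bit per condition) is likewise an assumed, commonly used encoding for a binary PSO particle, and the function names and the toy matrix expr are hypothetical.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng & Church mean squared residue of the bicluster (rows x cols)."""
    n_r, n_c = len(rows), len(cols)
    if n_r == 0 or n_c == 0:
        raise ValueError("a bicluster needs at least one row and one column")
    # Extract the submatrix selected by the bicluster.
    sub = [[matrix[i][j] for j in cols] for i in rows]
    row_means = [sum(r) / n_c for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_r)) / n_r for j in range(n_c)]
    overall = sum(row_means) / n_r
    total = 0.0
    for i in range(n_r):
        for j in range(n_c):
            # Residue: deviation of a cell from the additive row/column model.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue * residue
    return total / (n_r * n_c)


def decode_particle(bits, n_rows):
    """Assumed encoding: first n_rows bits select genes, the rest select conditions."""
    rows = [i for i, b in enumerate(bits[:n_rows]) if b]
    cols = [j for j, b in enumerate(bits[n_rows:]) if b]
    return rows, cols


# Toy expression matrix (4 genes x 4 conditions), invented for illustration.
# Genes 0-2 follow an additive pattern over conditions 0-2.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 1.0, 5.0],
]

particle = [1, 1, 1, 0, 1, 1, 1, 0]  # selects genes 0-2 and conditions 0-2
rows, cols = decode_particle(particle, n_rows=4)
coherent_score = mean_squared_residue(expr, rows, cols)
whole_score = mean_squared_residue(expr, [0, 1, 2, 3], [0, 1, 2, 3])
print(coherent_score, whole_score)
```

In this sketch the additive 3 x 3 bicluster scores close to zero, while the whole matrix scores noticeably higher. A low residue alone tends to favor tiny biclusters, which is one reason bicluster size is often treated as a second, conflicting objective in multi-objective formulations.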
Keywords/Search Tags:Biclustering, Gene Expression Data, Swarm Intelligence Optimization, Multi-Objective Optimization, Pattern-Driven Search Operator