Swarm Intelligence Algorithm And Its Application In Gene Expression Data Clustering

Posted on: 2012-12-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Chen
GTID: 1488303362497914
Subject: Light Industry Information Technology and Engineering
Abstract/Summary:
Swarm intelligence is a novel computing approach for solving optimization problems. Since it was proposed in the 1980s, it has attracted extensive attention from many researchers and has become a research focus in the field of optimization. The term refers to a class of heuristic search methods that solve specified problems based on collective behavior.

As a typical swarm intelligence algorithm, Particle Swarm Optimization (PSO) was proposed by Kennedy and Eberhart in 1995. It was motivated by their research on the social behavior of bird flocking and draws on Frank Heppner's biological swarm model. PSO is characterized by simple computation and easy implementation with few control parameters, but it is not a globally convergent algorithm. Based on a study of the convergence behavior of particles in PSO and inspired by quantum mechanics, Quantum-behaved Particle Swarm Optimization (QPSO) was proposed. It is a new PSO model with fewer parameters, faster convergence and stronger global convergence ability than the PSO algorithm.

This dissertation focuses on the QPSO algorithm. First, the basic theory of the algorithm is studied in detail and improved methods are proposed. Then the QPSO algorithm is applied to cluster analysis of gene expression data, which is both a research focus and a difficult problem in bioinformatics. In our work, the clustering of gene expression data is formulated as an optimization problem and the clustering algorithm is designed on the basis of QPSO. The new clustering algorithm shows excellent performance and offers a novel method for gene expression data analysis. The main contents of this dissertation are as follows:

(1) First, the background of our research, including the status of swarm intelligence algorithms and gene expression data analysis, is reviewed in detail. The main research topic, namely the QPSO algorithm and its application in gene expression cluster analysis, is proposed. On this basis, the research objective and significance of the subject are presented. Then the foundation of the research, namely the basic theory of PSO and the main improvements of PSO, is elaborated.

(2) After the principle of QPSO is introduced, the comprehensive learning QPSO (CLQPSO) algorithm is proposed to solve the premature convergence problem of the original QPSO. In CLQPSO, the personal best (pbest) positions of all particles are used to update the local attractors. This strategy maintains the diversity of the population and thereby prevents premature convergence of the particles. The parameter selection of the CLQPSO algorithm is discussed in detail, and empirical settings are determined from simulations. To investigate the convergence performance and optimization ability of CLQPSO and to verify its superiority, 8 representative PSO and QPSO models, including CLQPSO, are chosen for numerical simulations. The results show that CLQPSO finds better solutions on most test functions, especially on multimodal problems. It is a globally convergent optimization algorithm with high convergence precision and fast convergence speed.

(3) Some basic theory of gene expression data cluster analysis is briefly introduced, including the gene expression array, the preprocessing of gene expression data, the similarity measure of gene expression data, the description of the clustering problem and cluster validation. Two QPSO gene expression data clustering algorithms, one based on cluster centroid coding and one based on cluster label coding, are proposed and tested on 6 selected gene expression data sets. The simulation results are fully discussed.

(4) The Binary QPSO (BQPSO) algorithm, which is designed specifically for discrete optimization problems, is studied in depth. The comprehensive learning BQPSO (CLBQPSO) algorithm is proposed, which introduces the comprehensive learning strategy into BQPSO to replace the crossover operator used in the local attractor update. The simulation results show that the new learning strategy effectively improves the global convergence of the algorithm. Then the CLBQPSO clustering algorithm based on cluster label coding is proposed and applied to the analysis of gene expression data.

(5) Because the clustering algorithms proposed above require a predefined cluster number that cannot be adjusted adaptively during the clustering process, two dynamic clustering algorithms based on QPSO are proposed. The first, the QPSO automatic clustering (QPSOACC) algorithm, uses a special coding method that adds a group of thresholds to each particle to control the activation of the corresponding centroids. Only the activated centroids in each particle are chosen to partition the data set during the clustering process. The second, the QPSO dynamic clustering (DCQPSO) algorithm, first initializes a set of cluster centroids. The particles of BQPSO are then used to select the activated cluster centroids, and the optimal combination of centroids is determined through the BQPSO update. Finally, the optimal clustering result is obtained by running K-means clustering at the end of the clustering process. After simulations are conducted to verify the effectiveness of the two algorithms, they are applied to the problem of gene expression data clustering.

Finally, the main research work and the results obtained are summarized at the end of the dissertation, and prospects for future research are discussed.
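
As a rough illustration of the QPSO idea summarized above, the sketch below implements the commonly published form of the QPSO position update: each coordinate is resampled around a local attractor built from the particle's pbest and the global best, with a spread proportional to its distance from the mean of all pbest positions. The abstract does not state the update equations, so the formula, the linearly decreasing coefficient `beta`, the clamping to bounds and every name here (`qpso_minimize`, `sphere`) are assumptions made for illustration, not the dissertation's implementation.

```python
import math
import random


def qpso_minimize(objective, dim, bounds, n_particles=20, n_iters=200, seed=0):
    """Minimize `objective` with a basic QPSO loop (illustrative sketch only)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions; each particle's pbest starts at its own position.
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iters):
        # Contraction-expansion coefficient, decreased linearly from 1.0 to 0.5
        # (an assumed schedule, not taken from the abstract).
        beta = 1.0 - 0.5 * t / max(1, n_iters - 1)
        # Mean best position: per-dimension average of all pbest positions.
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                # Local attractor: random convex combination of pbest and gbest.
                phi = rng.random()
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                u = 1.0 - rng.random()  # lies in (0, 1], so log(1/u) is finite
                step = beta * abs(mbest[d] - x[i][d]) * math.log(1.0 / u)
                new = attractor + step if rng.random() < 0.5 else attractor - step
                x[i][d] = min(hi, max(lo, new))  # keep the particle inside the bounds
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(v):
    """Simple unimodal test function with its minimum of 0 at the origin."""
    return sum(c * c for c in v)


best_pos, best_val = qpso_minimize(sphere, dim=5, bounds=(-10.0, 10.0))
```

For the centroid-coding clustering described in item (3), one plausible adaptation would be to let each particle hold the concatenated coordinates of all cluster centroids and to use a clustering criterion, such as the within-cluster sum of distances, as `objective`. That mapping is an assumption here, since the abstract does not specify the fitness function.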
Keywords/Search Tags: Optimization problem, swarm intelligence algorithm, particle swarm optimization, quantum-behaved particle swarm optimization, microarray, gene expression data analysis, clustering, dynamic clustering