Research On Gene Cluster Hybrid Algorithm Based On Particle-Pair And Extremal Optimization

Posted on: 2012-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: J B Xuan
GTID: 2218330338973125
Subject: Computer software and theory
Abstract/Summary:
With the completion of the Human Genome Project, life science research has entered the post-genome era, and its focus has shifted to determining the function of each gene in an organism and the interaction and regulatory relationships among genes. The gene chip is the most basic experimental method for functional genomics in this era: a single experiment can simultaneously monitor the expression of thousands of genes under different experimental conditions, producing massive gene expression data sets that carry information about gene activity. How to analyze and process these data and mine them for valuable biological and medical information has become a research hotspot. At present, clustering is one of the major computational techniques for analyzing and processing gene expression data. Clustering groups genes with similar or identical expression patterns into the same category, which supports integrated research on gene function, gene regulation, cellular processes, and cell subtypes, and is of practical significance for annotating the biological function of unknown genes and for clinical diagnosis and treatment. Many scholars in China and abroad have therefore proposed a wide range of clustering algorithms for gene expression data. Particle-Pair Optimization (PPO), a novel gene clustering algorithm, has achieved good clustering results on some gene expression data sets, but several problems remain to be solved.
This thesis studies how to further improve the clustering performance of the PPO algorithm. The main work is as follows:

(1) We briefly introduce the basic concepts of bioinformatics, then analyze in detail the acquisition, representation, preprocessing, cluster analysis principles, and cluster validation of gene expression data, and finally obtain the two groups of gene expression data sets used in the clustering experiments of this thesis.

(2) We briefly analyze the principles of two traditional clustering algorithms, K-means and hierarchical clustering, then introduce the standard particle swarm optimization (PSO) algorithm and analyze the principle, merits, and drawbacks of particle swarm clustering, and finally describe in detail the principle, clustering flow, and characteristics of the PPO algorithm.

(3) After a closer study of the basic PPO algorithm, we identify three open problems and propose three corresponding improvements: initializing one particle with the result of a fast K-means clustering; introducing a strategy for sharing best information between the initial particle pairs; and using different velocity update formulas for particles of different categories, according to statistical information about the particle pair. Together these form an improved particle-pair algorithm called ImPPO. To validate the individual improvements and ImPPO as a whole, we ran clustering experiments on three gene expression data sets. The results indicate that the improved schemes and ImPPO cluster better than K-means and the basic PPO algorithm on some gene expression data sets. They also show again that different clustering algorithms, and even the same algorithm with different parameters, may produce different clustering results on the same gene expression data set.

(4) After analyzing the principle and characteristics of the basic Extremal Optimization (EO) algorithm, we propose a new hybrid gene clustering algorithm, PPO-EO, which combines the merits of PPO and EO. EO is introduced into the iteration of the elitist particle pair at intervals of iterations. On the one hand, PPO-EO uses the powerful local search capability of EO to keep PPO from converging prematurely to a local optimum in the later stage; on the other hand, it uses the guaranteed convergence of PPO to overcome the lack of a convergence guarantee in EO. By exploiting the merits of both algorithms, PPO-EO can improve clustering precision. To evaluate the hybrid algorithm, we ran clustering experiments on the other three gene expression data sets. The results indicate that the hybrid algorithm is more precise than K-means and PPO on three cluster evaluation indicators: mean square error, homogeneity, and separation.
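The three evaluation indicators named above can be illustrated with a short sketch. The abstract does not give their formulas, so the definitions here are assumptions based on common usage in the gene clustering literature: mean square error as the mean squared Euclidean distance from each expression profile to its cluster center, homogeneity as the mean distance to the center (lower means tighter clusters), and separation as the size-weighted mean distance between cluster centers (higher means better separated clusters). The function names are hypothetical and the thesis may define the indicators differently.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length profiles."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def euclidean(a, b):
    """Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def evaluate_clustering(data, labels):
    """Return (mse, homogeneity, separation) for a hard clustering.

    The definitions are assumptions, not taken from the thesis:
      mse         = mean squared distance of each profile to its center
      homogeneity = mean distance of each profile to its center
      separation  = size-weighted mean distance between pairs of centers
    """
    # Group the profiles by cluster label.
    clusters = {}
    for row, label in zip(data, labels):
        clusters.setdefault(label, []).append(row)
    centers = {k: centroid(rows) for k, rows in clusters.items()}

    n = len(data)
    dists = [euclidean(row, centers[label]) for row, label in zip(data, labels)]
    mse = sum(d * d for d in dists) / n
    homogeneity = sum(dists) / n

    # Weight each pair of centers by the product of the cluster sizes.
    keys = sorted(clusters)
    num = den = 0.0
    for i, a in enumerate(keys):
        for b in keys[i + 1:]:
            weight = len(clusters[a]) * len(clusters[b])
            num += weight * euclidean(centers[a], centers[b])
            den += weight
    separation = num / den if den else 0.0
    return mse, homogeneity, separation


# Toy data: four two-condition "expression profiles" in two clusters.
toy_data = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
mse, homogeneity, separation = evaluate_clustering(toy_data, toy_labels)
```

Under these assumed definitions, a better clustering has lower mean square error and homogeneity and higher separation, so the three values are best compared across algorithms run on the same data set.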
Keywords/Search Tags:Bioinformatics, Gene Clustering, Particle-Pair Optimization, Extremal Optimization, Hybrid algorithm