The Research Of Gene Selection And Clustering Method In Gene Microarray Data Analysis

Posted on: 2013-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: S N Xu
Full Text: PDF
GTID: 2248330362971986
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the successive completion of genome projects, the volume of gene microarray data awaiting analysis and interpretation is growing exponentially. Analyzing these data helps us understand gene expression patterns at the molecular level and study life phenomena at the microscopic level. Gene chip technology is one of the main research areas of bioinformatics. It can measure the activity of tens of thousands of genes simultaneously, but it also produces large amounts of microarray data. Gene expression data sets are characterized by small sample sizes, high dimensionality, heavy noise, many redundant genes, and imbalanced distributions. How to mine useful biological information from these data and provide effective guidance for disease detection and typing has become a major topic in pattern recognition and data mining. This thesis studies several aspects of gene selection and clustering methods for microarray data. The detailed work is as follows.

1. The thesis proposes an improved Genetic Algorithm (GA) based on a neuron activation function. The main idea is to modify the mutation and crossover operators using the neuron activation function while taking the within-class distance and the between-class distance into account.

2. Taking full account of the characteristics of microarray data and the advantages and disadvantages of GA, the thesis uses a leukemia data set as the study case and runs simulations with GA-based K-means clustering. The experimental results show that the method achieves good classification performance and higher classification accuracy than the clustering algorithms commonly used for gene selection on microarray data. It also reduces the gene dimension.

3. Considering the small sample size and high dimensionality of gene microarray data, the thesis proposes a gene selection method that combines the particle swarm algorithm with the clustering error rate. Simulations on cancer data show that the method yields smaller feature subsets with better classification ability.
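
The abstract does not give the exact form of the clustering error rate, so the following is only a minimal sketch of how such a score could be computed for one candidate gene subset. It assumes a binary mask over genes, plain K-means with a deterministic start, and an error defined as the fraction of samples left unmatched under the best cluster-to-class assignment. The function names and the toy expression values are hypothetical and do not come from the thesis.

```python
from itertools import permutations


def kmeans_labels(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def clustering_error_rate(samples, classes, mask):
    """Fraction of samples misassigned when clustering on the genes selected by mask."""
    if not any(mask):
        return 1.0  # assumption: an empty gene subset is scored as the worst case
    selected = [[v for v, keep in zip(row, mask) if keep] for row in samples]
    class_ids = sorted(set(classes))
    labels = kmeans_labels(selected, len(class_ids))
    # Cluster indices are arbitrary, so score the best cluster-to-class matching.
    best = max(
        sum(perm[lab] == cls for lab, cls in zip(labels, classes))
        for perm in permutations(class_ids)
    )
    return 1.0 - best / len(samples)


# Toy data invented for illustration: 6 samples x 3 genes, two classes.
# Gene 0 separates the classes; genes 1 and 2 do not.
samples = [
    [1.0, 5.0, 0.2], [9.0, 5.1, 0.9], [1.2, 0.3, 0.8],
    [8.8, 0.2, 0.1], [0.9, 5.2, 0.5], [9.1, 0.1, 0.4],
]
classes = ["ALL", "AML", "ALL", "AML", "ALL", "AML"]

print(clustering_error_rate(samples, classes, [1, 0, 0]))  # informative gene only
print(clustering_error_rate(samples, classes, [0, 1, 0]))  # uninformative gene only
```

A search procedure such as PSO or GA would then treat each particle or chromosome as one such mask and prefer masks with a lower error rate and fewer selected genes. How the thesis weighs these two goals is not stated in the abstract.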
Keywords/Search Tags: Microarray gene dataset, Feature Selection, Clustering, Particle Swarm Optimization (PSO), Genetic Algorithm (GA)