
Research On Classification Algorithm Based On Tumor Gene Expression Profile Data

Posted on: 2018-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: D H Yu
Full Text: PDF
GTID: 2334330542486992
Subject: Software engineering
Abstract/Summary:
With the rapid development of gene chip technology, more and more tumor genes can be measured and gene expression data have become easier to acquire, so the analysis of tumor gene expression profile data has become a routine step in tumor diagnosis and classification. However, the original gene expression data are high-dimensional, contain few samples, and are imbalanced across categories, which makes them difficult to use effectively. The high dimensionality can be handled by applying an appropriate dimensionality reduction algorithm to remove the large number of redundant genes and then using the extracted low-dimensional data to represent the original gene expression data. Category imbalance refers to the uneven distribution of samples across classes. It makes it difficult for a classifier to classify samples from the small categories effectively, which limits the performance of the classifier as a whole.

In this paper, we propose an Improved Weighted Extreme Learning Machine (WELM), based on the Extreme Learning Machine (ELM), to build a classification model for tumor gene expression data, with the goal of improving both the classification of such data and the diagnosis of tumors. An improved Particle Swarm Optimization (PSO) algorithm is added to optimize the classification weights of the WELM classifier, which improves both the classification performance and the speed of the classifier.

The main contents and achievements of this paper are as follows.

First, we study the WELM algorithm. To address data imbalance, the Improved WELM is applied to the classification of tumor gene expression data. Unlike the ELM, the Improved WELM adds an adjustable classification weight for each category. The weight of each class is determined by learning from the training set, which improves classification accuracy on class-imbalanced data. Because the class weights are adjustable, the method also classifies balanced data sets well. Experiments show that this algorithm improves classification accuracy on imbalanced data while maintaining good performance on balanced data.

Second, to improve the efficiency of the WELM, a Particle Swarm Optimization algorithm is applied to it. Experiments show that the proposed algorithm classifies gene expression data quickly and effectively, and improves efficiency while preserving classification accuracy.
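The class-weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the exact weighting rule of the Improved WELM nor the details of the PSO step, so the code below is an assumption: it uses the commonly cited closed-form solution of a weighted ELM, beta = (I/C + H^T W H)^(-1) H^T W T, with per-class weights passed in by the caller. The function names (`train_welm`, `predict_welm`, `inverse_frequency_weights`), the sigmoid activation, and the inverse-frequency default are hypothetical choices for illustration, not the author's method.

```python
import numpy as np


def inverse_frequency_weights(y):
    """Assumed default: weight each class by 1 / (number of samples in it)."""
    classes, counts = np.unique(y, return_counts=True)
    return {c: 1.0 / n for c, n in zip(classes, counts)}


def _hidden(X, W_in, b):
    # Sigmoid hidden layer with fixed random parameters (standard ELM setup).
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))


def train_welm(X, y, class_weights, n_hidden=50, C=1.0, seed=0):
    """Fit a weighted ELM; class_weights maps class label -> sample weight."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    # Input weights and biases are drawn once and never trained.
    W_in = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = _hidden(X, W_in, b)
    # One-vs-all targets in {-1, +1}, one column per class.
    T = np.where(y[:, None] == classes[None, :], 1.0, -1.0)
    # Per-sample weight, looked up from the sample's class.
    w = np.array([class_weights[c] for c in y])
    HtW = H.T * w  # same as H.T @ diag(w), without building the diagonal
    # Regularized weighted least squares for the output weights.
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return W_in, b, beta, classes


def predict_welm(model, X):
    W_in, b, beta, classes = model
    return classes[np.argmax(_hidden(X, W_in, b) @ beta, axis=1)]


if __name__ == "__main__":
    # Synthetic imbalanced data standing in for reduced gene expression features.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 1.0, (100, 5)), rng.normal(3.0, 1.0, (10, 5))])
    y = np.array([0] * 100 + [1] * 10)
    model = train_welm(X, y, inverse_frequency_weights(y))
    pred = predict_welm(model, X)
    print("minority-class recall:", np.mean(pred[y == 1] == 1))
```

Because the class weights are an ordinary argument, a search procedure such as PSO could in principle treat them as the quantities to tune against validation accuracy. How the thesis actually couples PSO to the WELM is not stated in the abstract.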
Keywords/Search Tags: Gene expression data, Category imbalance, Weighted extreme learning machine, Improved particle swarm optimization