Cancer is one of the major diseases affecting human health, and it has a high mortality rate. The prevention and treatment of cancer have become a focus of scientists around the world. Research shows that cancer is a complex genetic disease, so studying cancer gene expression profiles and selecting informative genes is a direct means of searching for cancer-related genes and finding cancer gene expression features. Gene expression data typically have high dimensionality and few samples, with redundant noise and little useful information. In the first part of this paper, we therefore first use dispersion analysis for feature selection on a colon cancer gene expression data set. We then design a vector classification algorithm based on an analysis of genetic relatedness. Using this algorithm, we further select feature genes and reduce the number of genes used for classification. Finally, we apply the support vector machine (SVM) and the potential function classification method to the colon cancer data set. Comparing classification accuracy and classification time, we find that SVM gives better results on the colon cancer data set. In the second part, we discuss the VC dimension, an important concept for describing the complexity of a learning machine. Building on the literature [1,2], for the VC dimension of the set of linear indicator functions in n-dimensional space, this paper improves the original proof method and makes the proof more general.
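
The two-stage gene selection described above can be illustrated with a minimal sketch. It is not the method of this paper. It assumes that the dispersion criterion is the squared gap between class means divided by the summed within-class variances, and that genetic relatedness is measured by the Pearson correlation between expression vectors. The function names, the toy data, and the parameters `top_k` and `max_corr` are hypothetical, and the final SVM or potential function classification step is omitted.

```python
from statistics import mean, pvariance


def dispersion_score(values, labels):
    """Assumed criterion: squared gap between the two class means,
    divided by the sum of the within-class variances."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    within = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / (within + 1e-12)


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_genes(genes, labels, top_k, max_corr):
    """Stage 1: keep the top_k genes by dispersion score.
    Stage 2: drop any gene whose absolute correlation with an
    already kept gene reaches max_corr (treated as redundant)."""
    ranked = sorted(
        genes, key=lambda g: dispersion_score(genes[g], labels), reverse=True
    )[:top_k]
    kept = []
    for g in ranked:
        if all(abs(pearson(genes[g], genes[k])) < max_corr for k in kept):
            kept.append(g)
    return kept


# Toy data (hypothetical): six samples, labels 0 = normal, 1 = tumor.
labels = [0, 0, 0, 1, 1, 1]
genes = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # strongly discriminative
    "g2": [1.0, 1.3, 0.8, 3.0, 3.2, 2.8],  # near copy of g1
    "g3": [1.0, 3.0, 2.0, 2.0, 1.0, 3.0],  # same mean in both classes
    "g4": [5.0, 3.0, 6.0, 1.0, 2.0, 0.5],  # discriminative, noisier
}
kept = select_genes(genes, labels, top_k=3, max_corr=0.95)
print(kept)
```

On this toy data the first stage discards the uninformative gene `g3`, and the second stage discards `g2` because it nearly duplicates `g1`. The retained genes would then be passed to the classifier.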