Research On Feature Gene Selection Method Based On Sample Weighting

Posted on: 2014-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: L Yang
GTID: 2268330425483627
Subject: Computer technology
Abstract/Summary:
Identifying and validating molecular biomarkers is an important challenge in tumor genome research for tumor diagnosis, prevention, and treatment. Because clinical trials and biological verification tests require a great deal of time and manpower, choosing a small number of important candidate molecular markers for verification is critical, and in studies of various types of tumor diseases, gene expression data have been widely used to identify candidate feature genes. From the machine learning perspective, gene selection can be regarded as feature selection on high-dimensional data, and its purpose is to select a small, optimal feature subset that explains the phenotypic differences between samples. At the same time, strong robustness in gene selection increases the enthusiasm of medical researchers for using its results. To improve the robustness of gene selection while preserving classification accuracy, this thesis presents a gene selection method based on sample weighting.
Firstly, because different samples contribute differently to a feature selection method, our method maps the original feature space to a margin space and then analyzes the margins between samples. If a sample differs significantly from the others, its presence or absence has a larger influence on the gene selection method than that of other samples. We therefore assign it a smaller weight to reduce its influence.
Secondly, to further improve the robustness of the algorithm, we integrate multiple criteria to evaluate every gene on the basis of the sample weights. This takes into account both the complementarity between the criteria and the relative importance of the samples, so that the evaluation of each gene is more objective and more comprehensive, which greatly improves the robustness of the algorithm.
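The sample-weighting and multi-criteria steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thesis's actual formulation: the abstract does not define the margin, the weighting function, or the criteria. The sketch assumes a Relief-style hypothesis margin (nearest miss minus nearest hit, per gene), a weight of 1 / (1 + average distance in margin space), and two hypothetical criteria (weighted mean gap and a weighted Fisher-like score) combined by summed ranks. All function names are illustrative.

```python
def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def margin_vector(i, X, y):
    """Assumed Relief-style margin of sample i, one component per gene.
    Requires at least two samples in the class of i and one in another class."""
    same = [j for j in range(len(X)) if j != i and y[j] == y[i]]
    diff = [j for j in range(len(X)) if y[j] != y[i]]
    hit = min(same, key=lambda j: manhattan(X[i], X[j]))    # nearest same-class sample
    miss = min(diff, key=lambda j: manhattan(X[i], X[j]))   # nearest other-class sample
    return [abs(a - m) - abs(a - h) for a, h, m in zip(X[i], X[hit], X[miss])]

def sample_weights(X, y):
    """Samples far from the others in margin space get smaller weights (assumed rule)."""
    n = len(X)
    margins = [margin_vector(i, X, y) for i in range(n)]
    avg_dist = [sum(manhattan(margins[i], margins[j]) for j in range(n) if j != i) / (n - 1)
                for i in range(n)]
    raw = [1.0 / (1.0 + d) for d in avg_dist]
    total = sum(raw)
    return [r / total for r in raw]  # normalized to sum to 1

def weighted_mean(vals, w):
    return sum(v * wi for v, wi in zip(vals, w)) / sum(w)

def weighted_var(vals, w):
    mu = weighted_mean(vals, w)
    return sum(wi * (v - mu) ** 2 for v, wi in zip(vals, w)) / sum(w)

def gene_scores(X, y, w):
    """One (mean gap, Fisher-like score) pair per gene, for class labels 0 and 1."""
    scores = []
    for g in range(len(X[0])):
        stats = []
        for c in (0, 1):
            idx = [i for i in range(len(X)) if y[i] == c]
            vals, ws = [X[i][g] for i in idx], [w[i] for i in idx]
            stats.append((weighted_mean(vals, ws), weighted_var(vals, ws)))
        (m0, v0), (m1, v1) = stats
        scores.append((abs(m0 - m1), (m0 - m1) ** 2 / (v0 + v1 + 1e-12)))
    return scores

def aggregate_ranks(scores):
    """Order genes by the sum of their ranks over all criteria (best first)."""
    n = len(scores)
    total = [0] * n
    for k in range(len(scores[0])):
        order = sorted(range(n), key=lambda g: -scores[g][k])
        for rank, g in enumerate(order):
            total[g] += rank
    return sorted(range(n), key=lambda g: total[g])
```

Called as `aggregate_ranks(gene_scores(X, y, sample_weights(X, y)))`, the sketch returns gene indices ordered from best to worst. Rank aggregation is used here because it needs no rescaling between criteria; other combination rules would fit the same description.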
To search quickly for a good combination of genes and to avoid the combinatorial explosion of gene interaction analysis, we use an ant colony algorithm to search the space of gene combinations heuristically.
Lastly, we carry out comparison experiments on real data sets. The experimental results show that the method effectively retains genes with marginal effects that a single criterion neglects, thereby improving the classification accuracy of the feature gene subset, and that the method is more robust.
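The ant colony search over gene combinations can be illustrated with the sketch below. It is a generic ant colony optimization loop under stated assumptions, not the thesis's algorithm: the abstract gives no update rules or parameters. Here each gene carries a pheromone value `tau`, each ant samples `k` distinct genes with probability proportional to `tau**alpha * eta**beta`, and subsets are scored by a caller-supplied `fitness` function. The names `aco_select`, `eta`, `rho`, and all default values are hypothetical.

```python
import random

def aco_select(n_genes, k, fitness, eta, n_ants=10, n_iter=30,
               alpha=1.0, beta=1.0, rho=0.1, seed=0):
    """Search for a k-gene subset with high fitness.
    eta: positive per-gene heuristic values (e.g. single-gene scores).
    fitness: function mapping a list of gene indices to a number."""
    rng = random.Random(seed)
    tau = [1.0] * n_genes                      # initial pheromone on every gene
    best, best_fit = None, float("-inf")
    for _ in range(n_iter):
        trails = []
        for _ in range(n_ants):
            chosen, candidates = [], list(range(n_genes))
            for _ in range(k):
                # sample one remaining gene, biased by pheromone and heuristic
                weights = [tau[g] ** alpha * eta[g] ** beta for g in candidates]
                g = rng.choices(candidates, weights=weights, k=1)[0]
                chosen.append(g)
                candidates.remove(g)
            fit = fitness(chosen)
            trails.append((chosen, fit))
            if fit > best_fit:
                best, best_fit = sorted(chosen), fit
        # evaporate, then deposit pheromone in proportion to subset fitness
        tau = [(1.0 - rho) * t for t in tau]
        for chosen, fit in trails:
            for g in chosen:
                tau[g] += max(fit, 0.0)
    return best, best_fit
```

In practice `fitness` would be something like the cross-validated accuracy of a classifier trained on the chosen genes, and `eta` could come from a per-gene ranking; both choices are left to the caller in this sketch.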
Keywords/Search Tags: Gene chip, Gene expression profile, Gene selection, Robustness