Research on Feature Selection Based on Multi-Objective Optimization

Posted on: 2013-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Teng
Full Text: PDF
GTID: 2248330377459111
Subject: Computer software and theory
Abstract/Summary:
Feature selection and clustering are two important problems in text categorization. The size and quality of the feature subset affect the precision and speed of information processing, and the clustering algorithm affects the accuracy of text classification. However, current feature selection algorithms require a threshold that must be set beforehand, and they depend excessively on the distribution of samples in the data set. Likewise, clustering the data after feature selection requires a fixed number of cluster centers. These clustering algorithms ignore both the degree of membership of each sample in every category and the influence of individual samples on the clustering algorithm.

This thesis combines evolutionary multi-objective optimization with feature selection. The method draws on several classic feature selection algorithms and exploits the ability of multi-objective optimization to search for Pareto-optimal solutions. Feature selection based on evolutionary multi-objective optimization is abbreviated as EMOO-FS. The method analyses feature attributes and selects two kinds: one performs very well on balanced data sets and the other on unbalanced data sets. A multi-objective optimization model is then constructed from the two selected feature attributes, which yields a group of statuesque features. This feature subset performs better than those of traditional algorithms on data sets whose sample distribution is unknown. EMOO-FS departs from the traditional practice of selecting features on a single attribute and overcomes the dependence on the data set. The method lays the foundation for the clustering process. For the feature subsets obtained after dimensionality reduction, we consider the measurement approaches used in clustering.
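The idea of keeping Pareto-optimal features under two scoring attributes can be illustrated with a small sketch. It is not the EMOO-FS algorithm itself: the thesis uses an evolutionary search, whereas this example simply filters out dominated features by brute force. The function name `pareto_front`, the feature names and the score values are all hypothetical.

```python
from typing import Dict, List, Tuple


def pareto_front(scores: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the features whose (score_a, score_b) pair is not dominated.

    A feature is dominated if some other feature is at least as good on
    both scores and strictly better on at least one (higher is better).
    """
    front = []
    for name, (a, b) in scores.items():
        dominated = any(
            (oa >= a and ob >= b) and (oa > a or ob > b)
            for other, (oa, ob) in scores.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical scores per feature:
# (attribute suited to balanced data, attribute suited to unbalanced data)
example_scores = {
    "term_a": (0.90, 0.20),
    "term_b": (0.60, 0.70),
    "term_c": (0.30, 0.95),
    "term_d": (0.50, 0.50),  # dominated by term_b
    "term_e": (0.20, 0.10),  # dominated by every other term
}

selected = pareto_front(example_scores)
print(selected)  # ['term_a', 'term_b', 'term_c']
```

No threshold is needed here: a feature is kept as long as no other feature beats it on both attributes at once, which is the property that frees the selection from a preset cut-off.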
Both the degree of membership of each sample in every category and the influence of samples on the clustering algorithm are calculated in this thesis. On this basis, a fuzzy clustering algorithm based on the Fisher criterion, FDC, is proposed. The two-way analysis in FDC removes the dependence on a preset number of cluster centers: the method creates the number of clusters dynamically and obtains a fair result.

In the first experiment, we compare EMOO-FS with IG, MI and CHI on two kinds of data sets using the F1 measure and the M1 measure. The results show that the algorithm can find the statuesque subset and obtain a better classification effect. In the second experiment, we compare FDC with KM and FCM on several kinds of data sets using accuracy and the Rand value. The results show that the algorithm obtains a better clustering effect on data sets with multiple class labels and on balanced data sets.
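For readers unfamiliar with the degree of membership, the following sketch computes it with the standard fuzzy c-means (FCM) formula, the baseline that FDC is compared against. It does not reproduce FDC or its Fisher-criterion step, which the abstract does not specify. The function name `fcm_memberships`, the fuzzifier default `m=2.0` and the sample coordinates are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def fcm_memberships(
    point: Sequence[float],
    centers: Sequence[Sequence[float]],
    m: float = 2.0,
) -> List[float]:
    """Degree of membership of one sample in each cluster (standard FCM).

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the
    Euclidean distance from the sample to center i and m > 1 is the
    fuzzifier. The memberships of one sample sum to 1.
    """
    dists = [math.dist(point, c) for c in centers]

    # If the sample coincides with one or more centers, share full
    # membership among them to avoid dividing by zero.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]

    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]


# Hypothetical sample lying closer to the first of two cluster centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print(u)  # approximately [0.9, 0.1]
```

Unlike a hard assignment such as k-means, every sample keeps a graded membership in every cluster, which is the information the abstract says conventional clustering ignores.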
Keywords/Search Tags:Feature Selection, Fuzzy clustering, Evolution Multi-Objective Optimization, Fisher criterion, Feature attribute