Research And Application Of Complex Data Classification Algorithm

Posted on: 2015-06-14
Degree: Master
Type: Thesis
Country: China
Candidate: D F Chen
GTID: 2348330485494271
Subject: Information management and information systems
Abstract/Summary:
The rapid development of e-commerce and advances in knowledge management technology have made data collection easier and faster. How to mine useful information from massive data has become an important problem in many research fields.
The "curse of dimensionality" in complex high-dimensional data has recently become a focus of data mining research. Classification is an effective method for data analysis, but as the dimensionality of datasets grows, irrelevant or redundant features greatly increase the difficulty of classification.
To obtain a distinguishing feature subspace for each class and thereby improve classification accuracy, a novel two-stage class-dependent feature selection method is proposed. The method is a Filter-Wrapper hybrid based on minimal-redundancy-maximal-relevance (mRMR) and a genetic algorithm (GA), combining the efficiency of the Filter approach with the accuracy of the Wrapper approach. In the Filter stage, mRMR selects a set of top-ranking features. In the Wrapper stage, a GA-SVM optimizes the feature subspace for each class, with a support vector machine (SVM) as the classifier; the GA optimizes the SVM parameters and the feature subset simultaneously for each class. The experimental results show that the method effectively removes irrelevant or redundant features for each class and improves classification accuracy in comparison with other feature selection approaches.
Finally, this thesis constructs the basic framework of a referral marketing system in which the class-dependent feature subspace selection classification algorithm is applied to recommendation.
In the referral marketing system, the recommendation algorithm learns the subspace model from customers' product or service records and constructs consumption feature subspaces, which are used to classify new customers and to target recommendations at both new and existing customers. The framework can recommend related products or services to customers, through text or other channels, according to each consumer's preference type.
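The Filter stage described in the abstract can be illustrated with a small sketch. The abstract does not give the thesis's implementation, so everything here is an assumption: features are taken to be discrete, the common "difference" form of the mRMR criterion (relevance minus mean redundancy) is used, and the names `mutual_information` and `mrmr_rank` are hypothetical. The GA-SVM Wrapper stage is not covered.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def mrmr_rank(columns, labels, k):
    """Greedy mRMR (difference form, an assumed variant).

    columns: list of feature columns, each a sequence of discrete values.
    Returns the indices of up to k features in the order they were selected.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        # Relevance to the class labels minus mean redundancy with the
        # features selected so far (no redundancy term for the first pick).
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up for illustration): the label is determined by a and b,
# a_copy duplicates a, and noise is independent of the label.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
a_copy = list(a)
noise = [0, 1, 0, 1, 0, 1, 0, 1]
labels = [0, 0, 1, 1, 2, 2, 3, 3]

top_two = mrmr_rank([a, a_copy, b, noise], labels, 2)
print(top_two)
```

On this toy data the redundancy term keeps the duplicate column `a_copy` out of the top two even though it is as relevant as `a`, which is the behaviour the abstract relies on when it says redundant features are removed before the Wrapper stage.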
Keywords/Search Tags: Classification, Feature selection, mRMR, Support vector machine, Genetic algorithm