Methods For Complex Data Classification And The Application In Personalized Recommender System

Posted on: 2013-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Wang
GTID: 2268330392970523
Subject: Information management and information systems
Abstract/Summary:
With the rapid development of information technology, people have accumulated massive amounts of data, and the volume is still growing exponentially. To exploit this data for social and economic needs, business intelligence (BI), represented by data mining, has been widely applied. Classification is among the most popular methods in data mining. With the spread of BI in management, high-performance classification of complex data, such as massive data and high-dimensional data, has become both a hot topic and a difficult problem in data mining and knowledge discovery. This thesis focuses on the complex data classification problem and studies both associative classification and subspace classification.

Firstly, an associative classification method is proposed to deal with massive data. The thesis first defines a novel rule interestingness metric named Typicality. This metric considers both the completeness and the confidence of a rule, which effectively avoids the invalid rules that methods under the support-confidence framework tend to produce. It then proposes a three-step rule pruning strategy that efficiently downsizes the classifier while maintaining high classification accuracy. Experiments on UCI datasets show that the method effectively decreases classifier complexity and improves classification accuracy.

Secondly, although associative classification handles most classification problems well, it is limited when it comes to high-dimensional data. To address this, a subspace classification method based on Kernel FDA is proposed. The method combines frequent pattern mining with a kernel-based feature extraction technique to discover all the subspaces. In this way, it decomposes one large classification problem into a series of small classification problems, which significantly reduces the complexity of the problem. Experimental results show that the proposed method effectively handles the high-dimensional data classification problem and, as a result, achieves better classification accuracy than other methods.

Thirdly, the proposed associative classification method is applied to construct a personalized recommendation system model that addresses a real-world problem.
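
The abstract does not give the formula for Typicality, so the following is only a minimal sketch of its two stated ingredients, confidence and completeness, for a single class association rule. It assumes the common rule-induction reading of completeness (the share of a class's records that the rule covers), which may differ from the thesis's definition. The function name `rule_stats`, the helper `matches`, and the toy records are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Tuple

# A record is (attribute values, class label). Toy data, not from the thesis.
Record = Tuple[Dict[str, str], str]


def matches(attrs: Dict[str, str], antecedent: Dict[str, str]) -> bool:
    """True if the record satisfies every attribute=value pair of the antecedent."""
    return all(attrs.get(k) == v for k, v in antecedent.items())


def rule_stats(records: List[Record], antecedent: Dict[str, str], label: str) -> Dict[str, float]:
    """Support, confidence and completeness of the rule `antecedent -> label`.

    Completeness here is an ASSUMED definition: covered records of the class
    divided by all records of the class.
    """
    n_total = len(records)
    n_ante = sum(1 for attrs, _ in records if matches(attrs, antecedent))
    n_label = sum(1 for _, y in records if y == label)
    n_both = sum(1 for attrs, y in records if y == label and matches(attrs, antecedent))
    return {
        "support": n_both / n_total if n_total else 0.0,
        "confidence": n_both / n_ante if n_ante else 0.0,
        "completeness": n_both / n_label if n_label else 0.0,
    }


# Ten toy records: the rule {outlook=sunny} -> yes is always right when it
# fires, but it covers only a quarter of the "yes" class.
records: List[Record] = (
    [({"outlook": "sunny"}, "yes")] * 2
    + [({"outlook": "rainy"}, "yes")] * 6
    + [({"outlook": "rainy"}, "no")] * 2
)

stats = rule_stats(records, {"outlook": "sunny"}, "yes")
print(stats)
```

On this toy data the rule has perfect confidence but low completeness, so a filter based on support and confidence alone could rank it highly even though it describes little of its class. A metric that also weighs completeness, as the abstract says Typicality does, would be able to tell the two situations apart.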
Keywords/Search Tags: data mining, classification, associative classification, subspace, FDA, frequent pattern