Methods For Complex Data Classification And The Application In Personalized Recommender System

Posted on:2013-09-03

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Wang

Full Text:PDF

GTID:2268330392970523

Subject:Information management and information systems

Abstract/Summary:

With the rapid development of information technology, people have accumulated massive amounts of data, and the volume is still growing exponentially. To exploit this data for social and economic needs, business intelligence (BI), represented by data mining, has been widely applied. Classification is the most popular method in data mining. With the popularization of BI in management, high-performance classification of complex data, such as massive and high-dimensional data, has become both a hot topic and a difficult problem in data mining and knowledge discovery. This thesis focuses on the complex data classification problem and studies both associative classification and subspace classification.

Firstly, an associative classification method is proposed to deal with massive data. The thesis first defines a novel rule interestingness metric named Typicality. This metric considers both the completeness and the confidence of a rule, which effectively avoids the invalid rules that methods under the support-confidence framework often produce. Second, a three-step rule pruning strategy is proposed, which efficiently reduces the size of the classifier while maintaining high classification accuracy. Experiments on UCI datasets show that the method effectively decreases classifier complexity and improves classification accuracy.

Secondly, although associative classification handles most classification problems well, it is limited when applied to high-dimensional databases. To address this, a subspace classification method based on Kernel FDA is proposed. The method combines frequent pattern mining with a kernel-based feature extraction technique to discover all the subspaces. In this way, it decomposes one large classification problem into a series of small classification problems, which significantly reduces the problem complexity. Experimental results show that the proposed method handles high-dimensional data classification effectively and, as a result, achieves better classification accuracy than other methods.

Thirdly, the proposed associative classification method is applied to construct a personalized recommendation system model to solve a real-world problem.
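
The abstract names completeness and confidence as the ingredients of Typicality but does not give its formula, so the following Python sketch does not attempt to reproduce the metric. It only illustrates, on made-up toy records, how support, confidence, and completeness of a single class association rule might be computed. Completeness is assumed here to mean the fraction of a class's records that the rule covers; the function name `rule_stats` and the data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# A record is (set of items, class label). Toy data, invented for illustration.
Record = Tuple[FrozenSet[str], str]


def rule_stats(records: List[Record], antecedent: FrozenSet[str], label: str) -> Dict[str, float]:
    """Compute basic statistics for the rule `antecedent -> label`.

    Assumed definitions (not taken from the thesis):
      support      = records matching antecedent AND label / all records
      confidence   = records matching antecedent AND label / records matching antecedent
      completeness = records matching antecedent AND label / records with label
    """
    n = len(records)
    covered = [(items, c) for items, c in records if antecedent <= items]
    correct = sum(1 for _, c in covered if c == label)
    in_class = sum(1 for _, c in records if c == label)
    return {
        "support": correct / n if n else 0.0,
        "confidence": correct / len(covered) if covered else 0.0,
        "completeness": correct / in_class if in_class else 0.0,
    }


records: List[Record] = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a"}), "yes"),
    (frozenset({"a", "c"}), "no"),
    (frozenset({"b"}), "yes"),
    (frozenset({"c"}), "no"),
]

# Statistics for the rule {a} -> yes on the toy data.
stats = rule_stats(records, frozenset({"a"}), "yes")
print(stats)
```

A rule can have high confidence while covering only a small share of its class, which is one reason a metric might weigh completeness alongside confidence, as the abstract describes.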

Keywords/Search Tags:

data mining, classification, associative classification, subspace, FDA, frequent pattern

Related items

1	Research And Implement Of A Frequent Pattern List Based Associative Classification Algorithm
2	Research On Association Rules Mining And Associative Classification Based On Bit Table
3	Study On Associative Classification Based On Closed Frequent Itemsets
4	Research On An Associative Classification Algorithm To Data With Uncertain Attributes
5	Objective Interestingness Measure And Its Application In Associative Classification
6	Research On The Frequent Substructure Mining Algorithm For Graph Classification
7	Research Of Mining Frequent Patterns And Classification On Data Streams
8	Research On All Frequent Itemsets Mining Algorithm And Its Application To The Classification Area
9	Associative Classifier For Uncertain Data
10	Text Classification Method Based On The Longest Closed Frequent Sequential Patterns