Research And Implementation Of Classification System Based On Concept Lattice

Posted on: 2006-03-06 | Degree: Master | Type: Thesis
Country: China | Candidate: L Zhao | Full Text: PDF
GTID: 2168360155452941 | Subject: Computer system architecture
Abstract/Summary:
Data mining is the extraction of interesting knowledge from databases. Classification is an essential data mining task that has been studied by many researchers. Popular classification methods include decision trees, Bayesian classification, neural networks, and support vector machines. The concept lattice is the core structure of formal concept analysis (FCA), which was introduced by Rudolf Wille in Germany in 1982. The concept lattice is an effective tool for data analysis, knowledge discovery, machine learning, information retrieval, and document browsing, and especially for classification and association rule mining. Several FCA-based algorithms have been proposed for the classification task; they differ in their learning and classification strategies, and all perform well. Recent research shows that there are still many approaches to explore in order to build efficient FCA-based classifiers, so we studied classification algorithms based on the concept lattice. First, we studied how to extract valid classification rules from the concept lattice model. In data mining, rule-based knowledge is a valued pattern. Most algorithms for association rule mining derive from the classical Apriori algorithm. Algorithms based on this approach perform very well on weakly correlated data such as market basket data, but their performance drops drastically on correlated data such as census data. One main reason is that the number of rules extracted from correlated data by this approach is very large, which restricts performance. The set of concept intents in a concept lattice is a closure system, and the intent of a concept is also a closed frequent itemset, so searching for frequent itemsets in the concept lattice can improve the efficiency of association rule mining. Moreover, in correlated databases, very few frequent itemsets are concept intents.
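To make the link between concept intents and closed itemsets concrete, here is a minimal Python sketch (the thesis system itself was written in C++). The toy context, the function names, and the brute-force enumeration are illustrative assumptions, not the thesis's algorithm. It shows that on correlated data, several frequent itemsets collapse onto the same closed intent.

```python
from itertools import combinations

# Hypothetical toy formal context (object -> attributes); not data from the thesis.
CONTEXT = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b", "c"},
    "o3": {"a", "b"},
    "o4": {"c"},
}

def extent(attrs, context):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)

def intent(objs, context):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set().union(*context.values())
    for o in objs:
        shared &= context[o]
    return frozenset(shared)

def closure(attrs, context):
    """Closure of an attribute set: the intent of its extent."""
    return intent(extent(attrs, context), context)

def frequent_closed_itemsets(context, min_support):
    """Brute-force enumeration, only suitable for tiny contexts.

    Returns a dict mapping each closed itemset (concept intent) with
    support >= min_support to its support count.
    """
    attributes = sorted(set().union(*context.values()))
    closed = {}
    for size in range(len(attributes) + 1):
        for combo in combinations(attributes, size):
            support = len(extent(set(combo), context))
            if support >= min_support:
                closed[closure(set(combo), context)] = support
    return closed

CLOSED = frequent_closed_itemsets(CONTEXT, min_support=2)
```

In this toy context, all eight attribute subsets are frequent at a minimum support of 2, but they map onto only four closed itemsets, which is the reduction that mining from intents exploits.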
Algorithm CL_Rule defines two bases for classification rules, a basis for exact classification rules and a basis for approximate ones, whose union is a generating set for all valid classification rules. The antecedent of a rule in the bases is a reduced intent of a concept related to the class attribute, and the consequent is the class attribute. A reduced intent of a concept is a minimal form of its intent: the intent and the reduced intent have the same extent, and dropping any attribute from the reduced intent enlarges the extent. Using the union of the two bases, we can classify efficiently while reducing the number of classification rules. Second, we construct the classifier using a heuristic method. Producing the best classifier out of the whole set of rules would involve evaluating every possible subset of rules on the training data and selecting the subset, with the right rule sequence, that gives the fewest errors. There are 2^m such subsets, where m is the number of rules, not to mention the different rule sequences, so exhaustive search is clearly infeasible. Algorithm CL_Classifier adopts a heuristic instead, sorting the generated classification rules by a total order. After training on the cases in the training set, the classifier selects the subset of rules that yields the highest classification accuracy. The selected subset is then used to classify cases with unknown class labels. The basic algorithm is inefficient because it needs to make many passes over the dataset, so we present an improved version that makes only slightly more than one pass over the dataset. Finally, the classification system CLRC was implemented in C++ using the Standard Template Library. It generates a concept lattice from the input dataset and extracts classification rules from the lattice using the approach...
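The heuristic selection described above can be sketched as follows. This is not CL_Classifier itself: the abstract does not state the total order or the selection details, so the ordering used here (higher confidence, then higher support, then shorter antecedent), the default-class fallback, and all names and data are assumptions in the style of common rule-based classifiers. The sketch corresponds to the basic multi-pass variant, since it rescans the remaining training cases for every rule.

```python
from collections import Counter
from typing import NamedTuple

class Rule(NamedTuple):
    antecedent: frozenset  # attributes that must all be present in a case
    label: str             # class predicted by the rule
    confidence: float
    support: int

def rank(rules):
    # Assumed total order: confidence desc, support desc, antecedent length asc.
    # sorted() is stable, so remaining ties keep their generation order.
    return sorted(rules, key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))

def majority(cases, fallback):
    """Majority label of cases (or of fallback if cases is empty) and its error count."""
    pool = cases if cases else fallback
    label, count = Counter(y for _, y in pool).most_common(1)[0]
    errors = len(cases) - count if cases else 0
    return label, errors

def build_classifier(rules, train):
    """Greedy selection: keep the ranked-rule prefix with the fewest training errors.

    train is a list of (attribute_set, label) pairs.
    Returns (selected_rules, default_label).
    """
    remaining = list(train)
    selected, rule_errors, best = [], 0, None
    for rule in rank(rules):
        covered = [(x, y) for x, y in remaining if rule.antecedent <= x]
        if not any(y == rule.label for _, y in covered):
            continue  # the rule classifies no remaining case correctly
        selected.append(rule)
        rule_errors += sum(1 for _, y in covered if y != rule.label)
        remaining = [(x, y) for x, y in remaining if not rule.antecedent <= x]
        default, default_errors = majority(remaining, train)
        total = rule_errors + default_errors
        if best is None or total < best[0]:
            best = (total, len(selected), default)
    if best is None:
        return [], majority(train, train)[0]
    return selected[:best[1]], best[2]

def classify(case, selected, default):
    """Apply the first matching rule, falling back to the default class."""
    for rule in selected:
        if rule.antecedent <= case:
            return rule.label
    return default

# Hypothetical training cases and rules, for illustration only.
TRAIN = [({"a", "b"}, "yes"), ({"a"}, "yes"), ({"b"}, "no"),
         ({"c"}, "no"), ({"a", "c"}, "yes")]
RULES = [Rule(frozenset({"b"}), "no", 0.5, 1),
         Rule(frozenset({"a"}), "yes", 1.0, 3),
         Rule(frozenset({"c"}), "no", 0.5, 1)]

SELECTED, DEFAULT = build_classifier(RULES, TRAIN)
```

The greedy scan evaluates at most m rule prefixes instead of all 2^m subsets. On the toy data, the single rule on attribute "a" plus a default class already classifies every training case correctly, so the later rules are dropped.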
Keywords/Search Tags:Implementation