Extracting Non-Redundant Association Rules From Concept Lattices

Posted on: 2009-02-08, Degree: Master, Type: Thesis
Country: China, Candidate: R Miao, Full Text: PDF
GTID: 2178360242498211, Subject: Applied Mathematics
Abstract/Summary:
In the information age, databases accumulate huge volumes of data. To extract useful information from this "sea of data", knowledge discovery in databases (KDD) has emerged as one of the most active research fields, and association rule mining is among its most studied and most popular tasks. Association rule mining is an important branch of data mining that describes potential relationships between attributes and variables in databases. Its chief task is to find the frequent itemsets. Algorithms for finding frequent itemsets fall into three groups: 1. Levelwise algorithms, of which Apriori is the most typical; however, the efficiency and effectiveness of this algorithm are not very good. 2. Algorithms that find the frequent itemsets by first finding the maximal frequent itemsets, for example the Pincer-Search algorithm; however, because of limitations in their theoretical basis, information is lost when association rules are generated from their results. 3. Algorithms that find the frequent itemsets by discovering the frequent closed itemsets, i.e., algorithms based on formal concept analysis (FCA) and the concept lattice. The main idea of this group is to find the frequent closed itemsets first and then derive all frequent itemsets from them. These algorithms outperform Apriori-style algorithms, and the association rules can be found without any loss of information. Current association rule mining algorithms, however, simply generate all rules that satisfy the minimum support and minimum confidence, so a common problem is that a very large number of rules is generated from a database. This makes it very difficult for users to analyze and make use of the rules, particularly for data sets whose attributes are highly correlated. The mined association rules must therefore be processed further before they can be understood well.
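The idea behind the third group can be illustrated with a minimal sketch. It is not the thesis's implementation: the toy transaction database, the function names, and the brute-force enumeration are all assumptions chosen for readability. The sketch computes the closure of an itemset (the intersection of all transactions containing it), collects the frequent closed itemsets, and then recovers the support of any frequent itemset from the closed ones alone.

```python
from itertools import combinations

# Hypothetical toy transaction database (illustrative only, not from the thesis).
TRANSACTIONS = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset(set.intersection(*map(set, covering)))


def frequent_closed_itemsets(transactions, min_count):
    """Brute-force map {closed itemset: support}; exponential, toy data only."""
    items = sorted(set().union(*transactions))
    closed = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_count:
                # An itemset and its closure always have the same support.
                closed[closure(candidate, transactions)] = count
    return closed


def support_from_closed(itemset, closed):
    """Support of a frequent itemset, read off its closed supersets only."""
    counts = [count for cs, count in closed.items() if itemset <= cs]
    return max(counts) if counts else 0


if __name__ == "__main__":
    closed = frequent_closed_itemsets(TRANSACTIONS, min_count=2)
    for itemset, count in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
    print(support_from_closed(frozenset("ae"), closed))
```

On this toy data, 15 frequent itemsets are summarized by 5 closed ones, which is why no support information is lost when only the closed itemsets are stored.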
Because the concept lattice of formal concept analysis has good mathematical properties, we regard the concept lattice model as a well-suited tool for data mining. If succinct association rules can be extracted from the concept lattice, users are freed from sifting through large numbers of rules and can find useful information more easily; the speed of rule mining also improves, which is of great significance to data mining. Building on existing association rule mining algorithms, this thesis focuses on eliminating redundant association rules. By analyzing the mathematical properties of the concept lattice and the nature of redundant association rules, we find that rules with minimal antecedents and maximal consequents can be obtained from the intent relations between a sub-concept and its super-concepts, together with the minimal sets associated with a concept's intent. From such rules, the other rules that satisfy the same minimum support and minimum confidence can be deduced, thereby producing the complete set of association rules.
Contributions:
1. This thesis proposes an FCA-based method for eliminating redundant rules, derived from a study of the form and nature of redundant association rules. Based on the definition and properties of non-redundant association rules, it introduces two concepts, the basis of non-redundant exact association rules and the basis of non-redundant approximate association rules, and designs algorithms for generating these two bases.
2. This thesis presents a new algorithm called NARG that extracts non-redundant association rules based on the concept lattice and the properties of redundant association rules. Experiments in the thesis show that the algorithm obtains the minimal non-redundant set of association rules and effectively improves the efficiency of rule extraction without losing any useful information in the data.
3. These ideas are put into practice: 1) they are integrated into the IsoFCA system; 2) they are applied successfully and effectively to financial revenue and expenditure analysis.
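The notion of rules with minimal antecedents and maximal consequents can be illustrated with a small sketch. This is not the NARG algorithm, whose details the abstract does not give; it is the well-known generator-based construction for exact (100% confidence) rules, shown here as an assumption about what such a basis looks like. For each closed itemset, every minimal generator g yields the rule g → closure(g) \ g. The toy data and all names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy transaction database (illustrative only, not from the thesis).
TRANSACTIONS = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset(set.intersection(*map(set, covering)))


def exact_rule_basis(transactions, min_count):
    """Return (antecedent, consequent, support) triples for exact rules.

    Antecedents are minimal generators; consequents are the rest of the
    closure. Brute force over all itemsets, so suitable for toy data only.
    """
    items = sorted(set().union(*transactions))
    generators = []  # (minimal generator, its closure), smallest sets first
    rules = []
    for size in range(0, len(items) + 1):
        for combo in combinations(items, size):
            g = frozenset(combo)
            count = sum(1 for t in transactions if g <= t)
            if count < min_count:
                continue
            closed = closure(g, transactions)
            # g is minimal if no smaller generator inside it has the same closure.
            if any(h < g and c == closed for h, c in generators):
                continue
            generators.append((g, closed))
            if closed != g:  # skip rules with an empty consequent
                rules.append((g, closed - g, count))
    return rules


if __name__ == "__main__":
    for antecedent, consequent, count in exact_rule_basis(TRANSACTIONS, 2):
        print(sorted(antecedent), "->", sorted(consequent), "support", count)
```

Every other exact rule on this data, such as {a, b, c} → {e}, has a larger antecedent or a smaller consequent than some rule in the output, which is the sense in which it is redundant.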
Keywords/Search Tags: data mining, concept lattice, association rules, rules basis, minimal non-redundant rules