The Design And Application Of Association Mining Matrix Algorithm Based On Equivalence Class

Posted on: 2009-10-01 | Degree: Master | Type: Thesis
Country: China | Candidate: H M Wang | Full Text: PDF
GTID: 2178360242998321 | Subject: Applied Mathematics
Abstract/Summary:
Association rules are one of the most important knowledge representation methods. Applied to archive information management, association rule mining can provide proactive service to users, reveal whether the age structure and post structure of the staff are reasonable, and identify why some teachers leave the school. This helps the relevant departments formulate management policies that strengthen the teaching staff, and thereby greatly improves the quality and efficiency of archive information management. However, existing association rule algorithms either generate a large number of candidate itemsets, which hurts efficiency, or do not consider the relationships between transactions, which are numerous in some databases such as personnel archive databases. How to improve existing association rule algorithms for use in archive management information systems has therefore become a hot topic in current research.
This thesis proposes an association mining matrix algorithm based on equivalence classes (hereinafter the EC-AMMA algorithm). The algorithm stores the transaction set in a Boolean data matrix, scans the database only once, partitions the transactions into equivalence classes to reduce the transaction set, and generates no candidate itemsets during operation. Because all frequent itemsets are contained in the maximal frequent itemsets, the algorithm mines association rules by finding the maximal frequent itemsets. First, the data obtained by scanning the database is transformed into Boolean data and stored in the Boolean data matrix. Next, taking the correlation between transactions into account, equivalence classes are partitioned according to the data matrix, and the matrix is reduced along both rows and columns using the equivalence relation and the properties of frequent itemsets. Finally, the reduced data matrix is scanned to obtain the maximal frequent itemsets directly, without candidate itemsets, and the association rules are derived from them.
It is shown that the EC-AMMA algorithm effectively reduces both time and space complexity: its efficiency is more than five times that of the Apriori algorithm when the frequent itemset dimension K is greater than 28. The research results have been given an initial application in the management of personnel archives, with good results. The main contributions are as follows: 1) Using the equivalence class relation to reduce the transaction set. 2) Putting forward the concept of item similitude degree to find all maximal frequent itemsets directly, without candidate itemsets. 3) Applying EC-AMMA to mine an archive database, with good results.
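The matrix construction and reduction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not define the equivalence relation, so the sketch assumes that two transactions are equivalent when their rows in the Boolean matrix are identical, and it keeps one weighted row per class. Column reduction relies on the standard property that an item below the minimum support cannot belong to any frequent itemset. The function names and the `min_support` parameter are hypothetical, and the item similitude degree step is not reproduced because the abstract gives no definition of it.

```python
from collections import Counter


def build_bool_matrix(transactions):
    """Scan the transactions once and encode them as Boolean rows.

    Returns the sorted item list (column order) and one tuple of
    booleans per transaction.
    """
    items = sorted({item for t in transactions for item in t})
    matrix = [tuple(item in t for item in items) for t in transactions]
    return items, matrix


def reduce_matrix(items, matrix, min_support):
    """Reduce the matrix along columns, then along rows.

    Column reduction: drop items whose support is below min_support.
    Row reduction (assumed equivalence relation): merge identical rows
    into one class whose weight is the number of merged transactions,
    and discard rows that contain no remaining item.
    """
    col_support = [sum(row[j] for row in matrix) for j in range(len(items))]
    keep = [j for j, s in enumerate(col_support) if s >= min_support]
    kept_items = [items[j] for j in keep]

    classes = Counter(tuple(row[j] for j in keep) for row in matrix)
    classes = {row: w for row, w in classes.items() if any(row)}
    return kept_items, classes


def support(kept_items, classes, itemset):
    """Support of an itemset, computed from the weighted class rows."""
    idx = [kept_items.index(i) for i in itemset]
    return sum(w for row, w in classes.items() if all(row[j] for j in idx))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the thesis.
    data = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}, {"a", "b", "d"}]
    items, matrix = build_bool_matrix(data)
    kept, classes = reduce_matrix(items, matrix, min_support=2)
    print(kept, classes, support(kept, classes, {"a", "b"}))
```

Because each class row carries a weight, support counts taken on the reduced matrix agree with counts on the original transactions for any itemset made of the retained items, which is what makes the reduction safe under this assumed relation.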
Keywords/Search Tags: association rules, equivalence class, item similitude degree, data mining