Data Mining Of Association Rules

Posted on: 2002-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: J G Zhao
Full Text: PDF
GTID: 2208360032456733
Subject: Basic mathematics
Abstract/Summary:
Data mining (DM) is the extraction of interesting knowledge from databases. This knowledge is implicit, previously unknown, and potentially useful information. The development of data mining techniques has created opportunities to make the best use of data resources. Data mining has become an important field within AI and database research, and it is one measure of how intelligent an information system is. Association rules, introduced by R. Agrawal et al. [1], are an important topic in data mining [9,10,38]. Mining association rules has become an active, rapidly growing field that is widely applied to business and scientific databases. Many algorithms for mining association rules have been proposed [1,8,11,18,19,21,22,23,24,26,35]; a well-known one is the Apriori algorithm of R. Agrawal et al. [19]. Many of these algorithms adopt the idea of Apriori: (1) compute candidate sets, and (2) select the frequent itemsets from the candidate sets according to a threshold given by the user.

This thesis studies association rules. In traditional data mining methods the threshold is supplied by the user, based on the experience of domain experts or on the user's requirements. We argue that a threshold is more objective when it is computed by the program from the characteristics of the database. A matrix of 0s and 1s derived from the target database is called an attention matrix. Let M1 be a unit submatrix (a submatrix all of whose entries are 1) of maximum area in the n×m matrix M, called the maximum unit submatrix for short; the set of all attributes of M1 is called an attention itemset. The ratio m1/n is called the attention threshold of M, where m1 is the number of rows of M1 and n is the number of rows of M.

The work in this thesis is as follows:
(1) We study algorithms for computing a large unit submatrix and present six algorithms of low complexity. Unlike Apriori, our algorithms construct a large unit submatrix directly in the whole matrix (under certain conditions, the maximum unit submatrix) and avoid the complex process of computing candidate itemsets.
(2) We present the One-Dimension Change and the Two-Dimension Change for eliminating elements around the maximum unit submatrix. Both are step-by-step optimization methods. With either change, our algorithms find the maximum unit submatrix in more cases.
(3) We discuss a criterion for evaluating the maximum frequent itemset.
(4) We discuss a method for reducing data size by deleting low-degree vertices in a bipartite graph.
(5) We apply the maximum unit submatrix to a real database.
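
As a rough illustration of the definitions in the abstract, the following Python sketch builds a small attention matrix and searches it for an all-ones submatrix of maximum area. It is not one of the six algorithms of the thesis, which the abstract does not describe: it is a brute-force search over column subsets, exponential in the number of columns. The function names, the example matrix, the reading of a submatrix as any selection of rows and columns (not necessarily adjacent), and the formula m1/n for the attention threshold are all assumptions made for this example.

```python
from itertools import combinations


def supporting_rows(matrix, columns):
    """Return the indices of rows that contain a 1 in every chosen column."""
    return [i for i, row in enumerate(matrix) if all(row[j] == 1 for j in columns)]


def max_unit_submatrix(matrix):
    """Brute-force search for the all-ones submatrix of largest area.

    Assumption: a submatrix is any selection of rows and columns.
    Returns (row_indices, column_indices). Illustration only: the cost
    grows exponentially with the number of columns.
    """
    n_cols = len(matrix[0]) if matrix else 0
    best_rows, best_cols = [], ()
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            rows = supporting_rows(matrix, cols)
            # Area = number of rows times number of columns.
            if len(rows) * size > len(best_rows) * len(best_cols):
                best_rows, best_cols = rows, cols
    return best_rows, list(best_cols)


def attention_threshold(matrix):
    """Assumed definition: m1 / n, rows of the maximum unit submatrix over all rows."""
    rows, _ = max_unit_submatrix(matrix)
    return len(rows) / len(matrix)


# Hypothetical attention matrix: 5 transactions (rows) x 4 attributes (columns).
M = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 0, 1],
]

rows, itemset = max_unit_submatrix(M)
print("rows:", rows, "attention itemset (column indices):", itemset)
print("attention threshold:", attention_threshold(M))
```

On this example the search selects columns 0 and 1 with rows 0, 1, 2 and 4 (area 8), so the attention itemset has two attributes and the assumed threshold is 4/5 = 0.8.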
Keywords/Search Tags:Attention Matrix, Ladder full of 1, Acreage, Maximum Unit Submatrix, Attention Threshold, Jam, Attention Itemset, One-Dimension Change, Two-Dimension Change, Data Reduction