
Studies On Algorithms Of Association Rule Mining In Data Mining

Posted on: 2002-08-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Gao
Full Text: PDF
GTID: 1118360062475198
Subject: Signal and Information Processing
Abstract/Summary:
Knowledge is strength. With the rapid development of information technology, e-commerce and WWW applications, massive amounts of data have been continuously collected in the databases of many application areas. These data contain many useful patterns, and finding the hidden, previously unknown information in them is very important for these areas; this is the task that data mining addresses. In recent years, new concepts and theories of data mining have been proposed, and data mining products have been released by major IT companies worldwide (such as IBM, Oracle and Microsoft). Association rule mining is a form of data mining that discovers previously unknown, interesting relationships among attributes in large databases. Because its form is simple and easy to understand, association rule mining has attracted great attention in the database, artificial intelligence and statistics communities, and many achievements have been made in its study. Compared with artificial intelligence methods, such as neural networks and genetic algorithms, it can process larger datasets; artificial intelligence methods usually process a small set of data and aim at finding a model between inputs and outputs, whereas association rule mining can find a large number of patterns among attributes. Furthermore, although large datasets can be processed in statistics, that work aims at finding data distributions or statistical models. Supported by the national 863 project, this dissertation mainly focuses on several key problems: association rule mining with item constraints, association rule mining with fuzzy quantitative constraints, optimized association rule mining, web usage mining, and quantitative association rule mining using statistical methods. New definitions, theorems and algorithms are presented and tested, and some problems in both theory and practical applications are solved successfully.
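As a point of reference for the relationships among attributes described above, the following is a minimal sketch of the two standard measures used to score an association rule X -> Y, namely support and confidence. The toy transactions and the function names `support` and `confidence` are illustrative assumptions and are not taken from the dissertation.

```python
# Hypothetical toy transactions; not data from the dissertation.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        # The rule never fires, so its confidence is reported as 0 here.
        return 0.0
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


if __name__ == "__main__":
    print("support(bread, milk):", support({"bread", "milk"}, transactions))
    print("confidence(bread -> milk):",
          confidence({"bread"}, {"milk"}, transactions))
```

A rule is usually reported only when both measures exceed user-chosen thresholds; the constrained and optimized variants listed above restrict or re-rank the rules that pass this basic filter.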
The major achievements of this dissertation are:
In chapter 2, the categories of constraints in association rule mining are introduced, and definitions, theorems and algorithms of association rule mining with item constraints are presented. Its current development is described from a technical perspective, and the associated concepts and definitions are explained. The effective algorithm presented in this chapter is well suited to mining association rules with low support and long patterns.
In chapter 3, the failure of association rule mining to consider the quantitative information associated with items is addressed on the basis of fuzzy theory. Associated definitions and algorithms are presented. The concepts of fuzzy query and rule template are combined effectively, formulas and a complete mining method are presented, and experiments are designed. The experimental results show that the method presented is a useful guide for mining association rules with quantitative constraints.
In chapter 4, the problems of optimized association rule mining are discussed. Definitions and theorems of unexpected association rules are presented. Two kinds of unexpected association rules are defined: the first is unexpected template rules, and the second is association rules whose consequents differ from those of the template rules; rules of the second kind are the final results presented to users. Associated algorithms are presented, a method of pruning itemsets with the chi-square (χ²) test is proposed, and a method of ordering rules of the second kind by information gain is also presented, with the observation that the larger the information gain, the larger the interest measure. In the design of the algorithms, an adjusted Apriori framework is presented, which generates small sets of itemsets and thereby improves the efficiency of the algorithms.
In chapter 5, the problems of web usage mining are discussed. The definitions and inclusion relations of clustered record, client record and client sequence are presented, which provides the basis for the further design of algorithms. In discussing the mining algorithms, this dissertation considers much in time cons...
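The chi-square test mentioned for chapter 4 can be illustrated with a short sketch. The abstract does not say how the dissertation applies the test, so the following assumes the common textbook form: a Pearson chi-square statistic on the 2x2 contingency table of two itemsets X and Y, with the pair kept only when the statistic reaches 3.841 (the critical value at the 0.05 level with one degree of freedom). The names `chi_square_2x2` and `keep_itemset` and the counts in the usage lines are hypothetical.

```python
def chi_square_2x2(n_xy, n_x, n_y, n):
    """Pearson chi-square statistic for presence of X versus presence of Y.

    n_xy: transactions containing both X and Y
    n_x:  transactions containing X
    n_y:  transactions containing Y
    n:    total number of transactions
    """
    if n == 0:
        return 0.0
    observed = [
        [n_xy, n_x - n_xy],                  # X present: Y present / Y absent
        [n_y - n_xy, n - n_x - n_y + n_xy],  # X absent:  Y present / Y absent
    ]
    row_totals = [n_x, n - n_x]
    col_totals = [n_y, n - n_y]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                # Degenerate table (X or Y always or never present): no evidence.
                return 0.0
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def keep_itemset(n_xy, n_x, n_y, n, critical=3.841):
    """Assumed pruning rule: keep the pair only if X and Y look dependent."""
    return chi_square_2x2(n_xy, n_x, n_y, n) >= critical


if __name__ == "__main__":
    # Hypothetical counts: 100 transactions, X and Y each appear in 50.
    print(keep_itemset(25, 50, 50, 100))  # co-occurrence matches independence
    print(keep_itemset(40, 50, 50, 100))  # co-occurrence well above independence
```

Pairs whose co-occurrence is consistent with independence are dropped before rule generation, which is one way such a test can shrink the set of itemsets an Apriori-style search has to examine.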
Keywords/Search Tags:Data Mining, Fuzzy Quantitative Constraints, Association Rule Mining, Web Usage Mining, Quantitative Association Rule Mining, Item Constraints