
Research On Theory And Algorithms For Mining Association Rules

Posted on: 2004-11-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X M Li
Full Text: PDF
GTID: 1118360095456609
Subject: Computer software and theory
Abstract/Summary:
As the scope of computer applications keeps expanding, and in particular with the rapid development of the Internet, large and even huge amounts of data have been produced in application systems and on the Internet, resulting in the phenomenon of "data explosion and knowledge scarcity". Data mining is the most effective method for tackling this problem. It includes many techniques, such as association rule mining, classification and prediction, clustering analysis, and evolution analysis. Association rule mining is the main technique among them and also the most widely used. The concept of association rules was first proposed in 1993 by Dr. Rakesh Agrawal, then working at IBM, to describe the relations between items in a transaction database, i.e. the frequent relations. Researchers have studied the topic for more than 10 years and achieved a great deal, but many problems remain to be solved urgently. This dissertation gives a detailed introduction to association rule mining and makes an in-depth study of its theory and especially its algorithms, with certain achievements.

The author divides association rule mining into five stages and puts forward the MMAR model. The proposed model modifies and improves Agrawal's two-stage model, bringing it more in line with current research and making it more useful in guiding future research on association rule mining.

In the study of association relations, the author proposes an extended association relation and an extended association rule. The extended association rule covers the basic association rule proposed by Rakesh Agrawal; from a semantic point of view, the latter is only a special case of the former. The extended association rule has practical value as well as theoretical significance. It contains both positive and negative frequent relations, while the basic association rule contains only positive frequent relations. In addition, several theorems concerning the calculation of the support of extended association rules are proven, and these theorems are used to establish an effective algorithm for mining extended association rules.

Mining usually yields an excessive number of association rules. Existing methods for addressing this problem rely on interest degree and on association rule mining with constraints, but their effectiveness is limited. The author therefore proposes atom association rules, which have a very strong capability for rule reduction and rule inference. Owing to the rule reduction capability, the number of association rules produced by mining can be reduced enormously. Owing to the rule inference capability, other association rules can be derived once the atom association rules have been produced, so the loss of knowledge is avoided. Atom association rules can reduce the number of association rules by a factor of several to several tens. In addition, the author proposes a two-stage mining algorithm for producing atom association rules: the first stage produces source association rules from the frequent items, and the second stage produces atom association rules from the source association rules.

Mining the frequent items is the most important area of association rule mining research. At present, most research focuses on how to increase the efficiency of mining frequent items. Current work improves mining efficiency mainly by making serial algorithms more effective, by using parallel and distributed algorithms, and by using incremental mining algorithms. To improve efficiency further, partial or specific association rule mining has been proposed, which includes algorithms for mining the most frequent items and for mining the closed-set frequent items. Through in-depth study, the author found that serial algorithms of association r...
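To make the terms in the abstract concrete, the following is a minimal Python sketch of support and confidence, extended to item sets that require some items to be absent (a "negative" relation in the abstract's sense). It is not the dissertation's algorithm, and the theorems it mentions are not reproduced here. The function names `support` and `confidence`, the `absent` parameter, and the toy transactions are all assumptions made for illustration. The only identity relied on is the standard one: the support of "A present and b absent" equals the support of A minus the support of A together with b.

```python
from typing import FrozenSet, Sequence

Transaction = FrozenSet[str]


def support(
    transactions: Sequence[Transaction],
    present: FrozenSet[str],
    absent: FrozenSet[str] = frozenset(),
) -> float:
    """Fraction of transactions that contain every item in `present`
    and none of the items in `absent` (hypothetical helper)."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if present <= t and not (absent & t))
    return hits / len(transactions)


def confidence(
    transactions: Sequence[Transaction],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Confidence of the positive rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0
    return support(transactions, antecedent | consequent) / base


# Toy transaction database, invented for this example.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

bread = frozenset({"bread"})
milk = frozenset({"milk"})

supp_bread = support(transactions, bread)
supp_bread_milk = support(transactions, bread | milk)
# Support of "bread present, milk absent", counted directly.
supp_bread_not_milk = support(transactions, bread, absent=milk)
```

In this toy database, the support of "bread without milk" obtained by direct counting matches the difference between the support of bread and the support of bread with milk, so a negative relation can be evaluated from positive supports alone without a second pass over the data.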
Keywords/Search Tags:Data mining, MMAR model, extended association rules, atom association rules, horizontal optimizing strategy, vertical optimizing strategy