Study On Parallel For Association Rules Mining

Posted on: 2004-07-22   Degree: Master   Type: Thesis
Country: China   Candidate: X S Yan   Full Text: PDF
GTID: 2168360122966500   Subject: Computer application technology
Abstract/Summary:
With the rapid growth of large databases, huge amounts of data have been stored in computers. Existing database systems, however, do not give users the effective tools they need to capture all of the stored information easily. Automatic knowledge discovery techniques have therefore been developed to capture and use the voluminous information hidden in large databases. Discovery of association rules is an important class of data mining whose aim is to capture the co-occurrence of itemsets. The most important step is to find the large itemsets efficiently, because this step is time-consuming and ultimately determines the efficiency of the algorithm. Current research therefore focuses on how to find the large itemsets in less and less time.

In this paper, we summarize the major concepts and recent developments of KDD/DM. We then give a formal problem description of mining association rules. We analyze the performance of two typical algorithms, Apriori and AprioriTid, for discovering all significant association rules between items in a large database of transactions, and then introduce the ideas behind some other typical algorithms and analyze their advantages and disadvantages. We propose a strategy for parallel mining of association rules, describe the basic algorithms, and analyze their performance. After analyzing the advantages and disadvantages of the serial and parallel algorithms for association rule mining, we run experiments on test data to evaluate the performance of the parallel algorithms based on the Apriori algorithm in a PVM environment. The paper aims to provide a flexible and scalable computing platform that uses a network of low-cost PCs and takes full advantage of network computing.

The paper is divided into six chapters, as follows:
1. Introduces the background of the paper and summarizes the research work and the paper structure.
2. Introduces data mining technology in brief.
3. Introduces the ideas behind association rule mining algorithms, summarizing the origin and development of association rules and the problem description. Introduces the ideas of the Apriori and AprioriTid algorithms, their code, and their advantages and disadvantages. Then introduces the ideas of some other typical algorithms and analyzes them. Finally, discusses some open problems in association rule mining.
4. Presents the core of the paper: the strategy of parallel mining and its algorithms. First proposes the strategy for parallel mining of association rules, then describes the types of parallel mining in detail and the parallel association rule mining algorithms based on these types. Finally, introduces other parallel algorithms and analyzes their advantages and disadvantages.
5. Introduces the idea of a parallel mining algorithm based on the Apriori algorithm, and runs experiments on test data to evaluate the performance of the parallel algorithms in a PVM environment.
6. Summarizes the paper and points out directions for further study and development.
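
For readers unfamiliar with the large-itemset search that the abstract refers to, the following is a minimal sketch of the standard serial Apriori level-wise procedure in Python. It is not the thesis's code and does not reproduce its parallel or PVM implementation. The function name `apriori`, the `min_support` parameter (an absolute transaction count), and the sample transactions are assumptions made for illustration only.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every large (frequent) itemset.

    `min_support` is an absolute count of transactions (an assumption of
    this sketch; support is often given as a fraction instead).
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items and keep the large 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    large = dict(current)

    k = 2
    while current:
        # Join step: unite pairs of large (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be large.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Counting pass: one scan of the database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        large.update(current)
        k += 1

    return large


if __name__ == "__main__":
    # Hypothetical market-basket data, for illustration only.
    sample = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, count in sorted(apriori(sample, 2).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

Each level costs one full scan of the transaction database, which is why the counting pass dominates the running time and is the natural part to distribute across machines.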
Keywords/Search Tags: Knowledge Discovery in Database, Data Mining, Association Rule, Large Itemset, Parallel Virtual Machine