Design of the DDA Distributed Mining Algorithm and the DADM Model

Posted on: 2005-03-26  Degree: Master  Type: Thesis
Country: China  Candidate: Z J Xie
GTID: 2208360125461101  Subject: Computer software and theory
Abstract/Summary:
Data mining, a technology that boomed in the mid-1990s, is a key step in the knowledge discovery process and a hot research topic in that field. In recent years, both academia and industry have attached importance to the research and development of data mining techniques and software tools, and have achieved some results. Discovering association rules is an important task in data mining. An association rule describes a relationship among a set of objects (for example, "occur together" or "one object implies another"). It can be written as X=>Y, where X is the premise of the rule and Y is the result. A rule is generally measured by two standards: support and confidence. Association rule mining aims to find the rules whose support and confidence reach the user-specified minimum support and minimum confidence, respectively.

The difficulty of the research lies in the sheer volume of data (several gigabytes or even terabytes), so the efficiency of the algorithm is the key. Current research focuses on how to discover large itemsets, and R. Agrawal and others put forward the Apriori algorithm in 1994. It is a classic frequent itemset method, but it has inherent defects:
1. Repeated scanning of the database increases the number of I/O operations during mining, which adds to the CPU's burden and reduces computing efficiency.
2. It can be applied only to a centralized database, not to a distributed database.
3. It cannot analyze sparse data.

This paper analyzes the Apriori algorithm from theoretical and practical perspectives and, to make up for its deficiencies, designs a new algorithm that can be applied to a distributed database. The algorithm works as follows:
1. All data is distributed to the local databases.
2. Each local database runs the Apriori algorithm to produce its local large itemsets and broadcasts them to the other sites.
3. After all sites have received the broadcast large itemsets, the database is scanned only once to transform its transaction information into a bit structure, which serves as the basis for the subsequent mining. This avoids unnecessary database scans, eases the I/O burden on the system, and achieves better efficiency.

Based on the new mining algorithm, and taking the daily retail business of supermarkets into consideration, the author designs a supermarket-oriented data mining model, DADM. To implement the DADM mining model, the author uses the Java programming language, which supports multiple platforms, together with object-oriented design and development methods. The author has also put considerable work into knowledge representation and explanation, so that the discovered knowledge is presented not only as digits and symbols but also as tables and graphics that are easy to comprehend.

Finally, the thesis summarizes the method of designing the mining algorithm and the mining model, which offers a new line of thought for the design and research of data mining systems for supermarkets. With large supermarkets as its background, the DADM mining model is characterized by complete functionality, simple operation, and strong extensibility. It is also not confined to one area when developed further: by analyzing domain-specific data, the model can be applied in industries such as banking, insurance, and meteorology.
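The support and confidence measures and the distributed steps described in the abstract can be illustrated with a small sketch. This is not the thesis's DDA implementation (which the abstract says was written in Java); it is a minimal Python illustration under stated assumptions: each site is a list of transactions held in memory, the "broadcast" is modeled as a simple set union, the bit structure is one Python integer per item, and all function names (`build_bitmaps`, `support_count`, `local_large_itemsets`, `global_large_itemsets`, `confidence`) are hypothetical.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Single scan: map each item to an int whose bit i is set
    when transaction i contains that item (assumed bit structure)."""
    bitmaps = {}
    for i, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << i)
    return bitmaps


def support_count(bitmaps, itemset, n):
    """Count transactions containing every item, by AND-ing the item bitmaps."""
    mask = (1 << n) - 1
    for item in itemset:
        mask &= bitmaps.get(item, 0)
    return bin(mask).count("1")


def local_large_itemsets(bitmaps, n, min_support):
    """Level-wise (Apriori-style) search over one site's bitmaps."""
    threshold = min_support * n
    level = [frozenset([item]) for item in bitmaps
             if support_count(bitmaps, [item], n) >= threshold]
    large = set(level)
    k = 2
    while level:
        # Join step: merge pairs of (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset must already be large.
        level = [c for c in candidates
                 if all(frozenset(s) in large for s in combinations(c, k - 1))
                 and support_count(bitmaps, c, n) >= threshold]
        large.update(level)
        k += 1
    return large


def global_large_itemsets(sites, min_support):
    """Return {itemset: global support} for globally large itemsets."""
    # Each site scans its own data once to build the bit structure.
    site_data = [(build_bitmaps(site), len(site)) for site in sites]
    # Modeled broadcast: the union of all locally large itemsets.
    candidates = set()
    for bitmaps, n in site_data:
        candidates |= local_large_itemsets(bitmaps, n, min_support)
    # Sum per-site counts from the bitmaps, without rescanning transactions.
    total = sum(n for _, n in site_data)
    counts = dict.fromkeys(candidates, 0)
    for bitmaps, n in site_data:
        for c in candidates:
            counts[c] += support_count(bitmaps, c, n)
    return {c: cnt / total for c, cnt in counts.items()
            if cnt >= min_support * total}


def confidence(supports, x, y):
    """Confidence of X=>Y: support(X and Y together) / support(X)."""
    return supports[frozenset(x) | frozenset(y)] / supports[frozenset(x)]


# Toy supermarket data split across two hypothetical sites.
site_a = [{"bread", "milk"}, {"bread", "butter"},
          {"bread", "milk", "butter"}, {"milk"}]
site_b = [{"bread", "milk"}, {"bread"}, {"milk", "butter"}, {"bread", "milk"}]

result = global_large_itemsets([site_a, site_b], min_support=0.5)
rule_confidence = confidence(result, {"bread"}, {"milk"})
```

With this toy data, {bread, milk} appears in 4 of the 8 transactions, so its global support is 0.5, and the rule bread=>milk has confidence 0.5 / 0.75, about 0.67. Taking the union of locally large itemsets as the global candidate set relies on the fact that an itemset meeting the minimum support over all the data must meet it at one site or more, provided every site uses the same relative threshold.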
Keywords/Search Tags: Data Mining, Association Rules, Apriori Algorithm, DDA Algorithm, DADM Mine Model, DDB