Multi-dimensional Multi-layer Data Mining Algorithm MPFP: Design and Its Application

Posted on: 2005-02-07, Degree: Master, Type: Thesis
Country: China, Candidate: L Qian, Full Text: PDF
GTID: 2208360125961100, Subject: Computer software and theory
Abstract/Summary:
Data mining, a technology that grew rapidly in the mid-1990s, is a key step in the knowledge discovery process and an active research topic in that field.

The discovery of association rules is an important task in data mining. An association rule describes a relationship among a set of objects. Two standard measures are generally used to evaluate a rule: support and confidence. Association rule mining aims to find the rules whose support and confidence are at least the user-specified minimum support and minimum confidence, respectively. Apriori-like algorithms have long been used to mine frequent patterns, but they require many database scans to check the occurrence frequencies of a huge number of candidate patterns.

The FP-growth algorithm adopts a pattern-fragment growth method and scans the database only twice. It is about an order of magnitude faster than the Apriori algorithm. However, it still has shortcomings in three respects:

(1) FP-growth can mine only single-level, single-dimensional frequent patterns, and it allows only one minimum support threshold, so frequent patterns with lower support are lost.
(2) When the database is very large or the minimum support is small, the FP-tree built from the whole database cannot fit in main memory, so FP-growth does not handle large-scale databases well.
(3) While constructing the FP-tree, the algorithm must examine every frequent item in each transaction and decide how to insert it into the tree, which seriously reduces its efficiency.

To address these shortcomings, I designed a new algorithm, the MPFP algorithm, which resolves them effectively.
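The two rule measures named in the abstract, support and confidence, can be illustrated with a minimal Python sketch. The function names and the toy transactions are assumptions made for illustration only; they do not come from the thesis.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Confidence of the rule antecedent -> consequent:
    support(antecedent and consequent together) / support(antecedent)."""
    lhs = frozenset(antecedent)
    both = lhs | frozenset(consequent)
    base = support(transactions, lhs)
    return support(transactions, both) / base if base > 0 else 0.0


# Hypothetical toy data, not taken from the thesis.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support(TRANSACTIONS, {"bread", "milk"}))       # 2 of 4 transactions
print(confidence(TRANSACTIONS, {"bread"}, {"milk"}))  # 2 of the 3 bread transactions
```

A rule such as bread -> milk would be reported only if both values reach the minimum thresholds chosen by the user.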
The MPFP algorithm has three strengths:

(1) It can mine multi-dimensional, multi-level data and derive association rules by setting different minimum supports at different levels.
(2) For a large-scale database, it partitions the database into many projection databases and constructs FP-trees based on the projection databases.
(3) In constructing the FP-tree, it combines the tree and projection techniques and builds the FP-tree level by level.

The MPFP algorithm scales well and, at the same time, considerably improves system performance.

Based on the MPFP algorithm and taking the daily retail business of the shipping trade into consideration, the author designs a shipping-trade-oriented data mining model, RS-MINER. RS-MINER is implemented in Java, which supports multiple platforms, using object-oriented design and development methods. The author has also put considerable work into knowledge representation and explanation, so that the mined knowledge is presented not only as numbers and symbols but also as easily understood tables and graphics, and has explained and evaluated the data mining results, namely the frequent patterns. With the shipping trade as its background, the RS-MINER mining model is characterized by complete functionality, simple operation, and strong extensibility.
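The abstract does not say how MPFP builds its projection databases, so the following Python sketch shows only the general idea, using the item-wise prefix projection commonly described for frequent-pattern mining. It should not be read as the thesis's actual partitioning scheme. The function name `project_database`, the parameter `min_count`, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


def project_database(
    transactions: Sequence[Sequence[str]], min_count: int
) -> Dict[str, List[List[str]]]:
    """Split a transaction database into per-item projected databases.

    Scan 1 counts items and keeps those with at least `min_count` occurrences.
    Scan 2 orders each transaction by descending item frequency and, for each
    item, stores the items that precede it as one row of that item's
    projected database. Each projected database is smaller than the original
    and could be given its own FP-tree and mined independently.
    """
    # Scan 1: count each item once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_count}

    # Global order: most frequent first, ties broken alphabetically.
    ordering = sorted(frequent, key=lambda item: (-frequent[item], item))
    rank = {item: position for position, item in enumerate(ordering)}

    # Scan 2: emit the prefix of every frequent item in every transaction.
    projected: Dict[str, List[List[str]]] = defaultdict(list)
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in rank), key=rank.__getitem__)
        for pos in range(1, len(ordered)):
            projected[ordered[pos]].append(ordered[:pos])
    return dict(projected)


# Hypothetical toy data, not taken from the thesis.
DATABASE = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
print(project_database(DATABASE, min_count=2))
```

Because every projected database holds only prefixes, a tree built on one of them needs far less memory than a single FP-tree over the whole database, which is the motivation given in strength (2).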
Keywords/Search Tags: data mining, association rules, Apriori algorithm, FP-growth algorithm, confidence, support