Multi-level Multi-dimensional Frequent Itemset Mining

Posted on:2005-07-30

Degree:Master

Type:Thesis

Country:China

Candidate:G H He

Full Text:PDF

GTID:2168360125963898

Subject:Computer application technology

Abstract/Summary:

What difficulty do people face in the "information explosion"? It is hard to extract useful information quickly from a sea of data. Knowledge discovery in databases (KDD) emerged to meet this need and has become one of the strongest tools available for solving the problem. Data mining is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.

The algorithm is the key part of KDD because it determines the efficiency of the whole process. On the one hand, data mining is applied to large databases, so algorithmic efficiency is of primary importance; on the other hand, the computers in common use are not adequate for processing large databases. Consequently, existing algorithms must be modified to meet these needs.

This thesis studies sequence mining algorithms in depth. Apriori-based algorithms must scan the database many times, which reduces their efficiency, and they also generate a large number of candidate sets. The FP-tree algorithm is a major advance over Apriori-based algorithms because it needs to scan the database only twice. However, the FP-tree algorithm applies a uniform minimum support, which forfeits part of this advantage.

Real-life transaction databases usually contain both item information and dimension information, and knowledge about multi-level, multi-dimensional frequent itemsets is interesting and useful. Classic frequent itemset mining algorithms, which are based on a uniform minimum support, either miss interesting patterns with low support or suffer from the bottleneck of itemset generation. In this thesis, we extend FP-growth to address the problem of multi-level, multi-dimensional frequent itemset mining; we call the resulting algorithm E-FP. To increase efficiency, we push various support constraints into the mining process. E-FP can discover both inter-level and intra-level frequent itemsets, and it also takes dimensions into account. We show that E-FP is more flexible at capturing the desired knowledge than the algorithms in previous studies.

Clustering analysis has been a very active area of research. It has been applied in data mining, web mining, e-commerce, and other fields. However, most algorithms ignore the fact that physical obstacles exist in the real world and can dramatically affect clustering results. In this thesis, we explore the problem of clustering in the presence of obstacles and provide an algorithm called ADP-Chameleon to solve it.
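
To make the point about uniform versus level-specific minimum support concrete, the following is a minimal Python sketch. It is not the E-FP algorithm and does not build an FP-tree; it simply enumerates itemsets by brute force. The taxonomy (`PARENT`), the per-level thresholds (`MIN_SUPPORT`), the sample transactions, and the rule that an itemset is judged against the threshold of its deepest item are all assumptions made for illustration and do not come from the thesis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical two-level taxonomy: leaf item -> parent category.
PARENT = {
    "skim milk": "milk",
    "whole milk": "milk",
    "wheat bread": "bread",
    "white bread": "bread",
}

# Hypothetical per-level minimum supports (fraction of transactions).
# Level 1 = categories, level 2 = leaf items.
MIN_SUPPORT = {1: 0.5, 2: 0.3}

# Hypothetical transaction database of leaf items.
TRANSACTIONS = [
    {"skim milk", "wheat bread"},
    {"skim milk", "white bread"},
    {"whole milk", "wheat bread"},
    {"skim milk"},
    {"wheat bread"},
]


def level(item):
    """Return 2 for leaf items and 1 for categories."""
    return 2 if item in PARENT else 1


def extend(transaction):
    """Add each leaf item's parent category to the transaction."""
    return set(transaction) | {PARENT[i] for i in transaction if i in PARENT}


def redundant(itemset):
    """True if the itemset holds an item together with its own parent."""
    return any(PARENT.get(i) in itemset for i in itemset)


def frequent_itemsets(transactions, max_size=2):
    """Brute-force enumeration of frequent itemsets up to max_size.

    Assumption: an itemset is compared against the minimum support of
    the deepest level among its items.
    """
    counts = Counter()
    for t in map(extend, transactions):
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                if not redundant(combo):
                    counts[combo] += 1
    n = len(transactions)
    return {
        itemset: c / n
        for itemset, c in counts.items()
        if c / n >= MIN_SUPPORT[max(level(i) for i in itemset)]
    }


if __name__ == "__main__":
    for itemset, support in sorted(frequent_itemsets(TRANSACTIONS).items()):
        print(itemset, round(support, 2))
```

On this sample data, the inter-level itemset `("bread", "skim milk")` has support 0.4. It passes the level-2 threshold of 0.3 but would be missed under a uniform threshold of 0.5, which is the kind of low-support pattern the abstract describes.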

Keywords/Search Tags:

data mining, FP-tree Algorithm, multi-level multi-dimensional itemset, E-FP Algorithm, clustering

Related items

1	Research On Projected Clustering Algorithm And Its Applications
2	Research On Multi-level Association Rules Algorithm And Decision Tree Algorithm In Data Mining
3	The Research On The Algorithm Of Multi-level Association Rule Mining
4	Multi-Relational Frequent Pattern Mining Algorithm And Its Application Research
5	Research On Multi-Dimensional Association Rules Mining
6	The Research Of Multi-dimensional And Multi-level Association Rule Algorithm
7	Multi-dimensional Multi-layer Data Mining Algorithm Mpfp Design And Its Application
8	Study On Multi-Level Association Rules Mining Algorithm Based On FP-Tree
9	Data Mining Technology Research And Application In The Tobacco Business Crm
10	The Analysis And Application Of Clustering Algorithm For Multi-Dimensional Data Streams