Font Size: a A A

Research On Algorithms Of Frequent Pattern Mining

Posted on:2008-08-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:L Q ZhanFull Text:PDF
GTID:1118360215459726Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Frequent pattern mining is a basic problem of the data mining area, and its research includes frequent pattern mining of various data such as item set, item sequence and time series etc. Its methods are widely used in other data mining areas, such as association rule, classification and clustering, periodic analysis, and similarity query etc. The topic of frequent pattern mining has attracted more and more researchers' attention because of its basic nature and inner complexity.This dissertation makes a research on algorithms of frequent pattern mining, and focused on the following aspects: total and partial frequent pattern mining of item set; frequent Web access path mining; and frequent pattern mining of time series. The major contributions of this dissertation as following:By making a deep research on FP-tree, which is a kind of data structure frequently used in frequent pattern mining algorithm, several novel process methods are proposed, including disassemble, merging, and projection of FP-tree. By using these novel process methods flexibly in frequent pattern mining algorithm, the efficiency of the algorithm can be improved.Two algorithms, FP-DFS and FIPT, are proposed for item set frequent pattern mining. The former is an algorithm for total frequent pattern mining, and the later is for partial frequent pattern mining. FP-DFS takes FP-tree as its basis data structure, but no longer constructs FP-tree recursively by using condition pattern base, nevertheless the algorithm improves its searching efficiency, and diminishes its main memory space occupation, by using the novel process methods of FP-tree, novel searching strategy, and novel pruning strategy proposed by this dissertation. The characteristic of FIPT is to combine Gallias Lattice with FP-tree, which improves the efficiency of Gallias Lattice modification by applying the novel process methods of FP-tree proposed by this dissertation, and solves the efficiency problem of transactions batch process. The constructed lattice can be used in incremental mining.For the case of frequent pattern mining of item sequence, the dissertation is focused on solving the problem of frequent Web access pattern mining. A novel Web transaction model is proposed, then a novel hybrid algorithm for Web access pattern mining named WDHP is proposed based on this model. WDHP carries the ideas of DHP to filter candidate set and prune database. When the database has been pruned to a proper size, it is stored in main memory by constructing FP-tree, and then the consecutive steps of the algorithm are executed in main memory, by which reduces the occupation of main memory space and improves the executive efficiency of the algorithm.For the case of frequent pattern mining of time series, the dissertation firstly analysis the existing problem of subsequence clustering, then proposes a novel clustering algorithm based on wavelet filter. Furthermore, a novel algorithm based on wavelet filter for frequent pattern mining of time series, Frequent-Wavelet, is proposed. The basic idea of Frequent-Wavelet is to pass time series through the a'trous smooth filters set, and then cluster the derived subsequences, so the problem of FTS mining is converted to the problem of FIS mining. The algorithm successfully solves the problems of trivial similarity and time axe stretching of time series frequent pattern mining, so compared with other similar algorithms, the algorithm can be more effective to find frequent patterns in time series.
Keywords/Search Tags:Frequent Pattern Mining, FP-tree, Frequent Access Path, Wavelet Transformation
PDF Full Text Request
Related items