Font Size: a A A

Research On Mining Algorithms Of Maximal Frequent Item Sets

Posted on:2006-11-21Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y J YanFull Text:PDF
GTID:1118360185463418Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With the development of information technology, especially the emerging of the network technology, our abilities to collect, store and transfer data have been improved dramatically. Comparing to the explosive growth of data, the needs for decision-relevant knowledge are not satisfied yet Knowledge discovery and data mining technology is an important approach to address this problem.As one of the main patterns in the field of data mining, association rules are used to determine the relationships among the attributes or objects, to find out valuable dependencies among the fields. The efficiency of mining frequent item sets is the key problem in association rules generating. Frequent item sets can be divided into three types: complete, closed and maximal. This dissertation focuses on the mining of maximal frequent item sets. The prune strategy, the ordering policy of tail item set and superset checking are studied thoroughly based on representation of datasets and the sets of maximal frequent item sets.Bitmap is an efficient representation structure of datasets and the sets of maximal frequent item sets. A depth-first mining algorithm-DFMfi which is based on bitmap is proposed in the dissertation. By utilizing the byte characteristic, DFMfi can optimize the mapping and unifying operations on the item sets. Moreover, for the first time a method based on bitmap which uses local maximal frequent item sets for fast superset checking is employed. DFMfi's correctness is proved. And experimental comparison with previous works indicates that DFMfi can obviously accelerate the generation of maximal frequent item sets.After near 10 years research and development, researchers begin to pay more attention to another compress representation structure of datasets, namely FP-Tree. The second part of the dissertation thus concentrates on the mining of maximal frequent item sets based on FP-Tree. Analysis and experimental results show that the superset checking operation is time-consuming and is frequently used in the mining process. Based on the observations, a new superset checking method based on projections of maximal frequent item sets is presented for the single MFI-Tree. And a simple method for fast superset checking is proposed for multiple MFI-Trees. Two algorithms, FPMFI and FIMFI, which are based on single and multiple MFI-Trees, are proposed respectively. In FIMFI, the item ordering policy of tail item sets is discussed and a new efficient ordering policy based on the...
Keywords/Search Tags:Data Mining, Association Rule, Frequent Item Set, Maximal Frequent Item set, Lookaheads Pruning, Superset Checking, Frequent Pattern Tree(FP-Tree), Maximal Frequent Item Set Tree(MFI-Tree), Combined Frequent Pattern Tree(CFP-Tree)
PDF Full Text Request
Related items