
Mining Closed Sequential Patterns Based On Bitmap

Posted on: 2009-06-26 | Degree: Master | Type: Thesis
Country: China | Candidate: X J Wang | Full Text: PDF
GTID: 2178360242998300 | Subject: Computer application technology
Abstract/Summary:
Sequential pattern mining is an important research area in data mining. Its primary goal is to discover previously unknown, interesting relationships among attributes in large databases. Most classic sequential pattern mining algorithms focus on mining the complete set of frequent patterns, and their space efficiency is low. The set of closed sequential patterns carries the same information but is more compact, so it can be mined more efficiently. This thesis addresses these problems and therefore concentrates on mining closed sequential patterns. The main work is as follows:

1. Several classic sequential pattern mining algorithms are studied in depth, chiefly the algorithms based on prefix-projected databases, which avoid candidate sequence generation, and the bitmap-based algorithm SPAM. The algorithms are analyzed qualitatively, their efficiency is compared, and the strengths and weaknesses of each are summarized.

2. Building on CloSpan, the classic closed sequential pattern mining algorithm, and borrowing the data structure used by SPAM, the sequence database is represented as bitmaps, and a bitmap-based closed sequential pattern mining algorithm, CSPBB, is designed. CSPBB is a depth-first algorithm that operates on the bitmap representation of the sequence database and adopts the prefix projection approach. Analysis and experimental comparison show that CSPBB outperforms CloSpan in both time and space cost.

3. Drawing on the multi-dimensional sequential pattern mining algorithms UniSeq and HTSeq, a multi-dimensional closed sequential pattern mining algorithm, Mul_Clo_Seq, is designed. Its main idea is to split the multi-dimensional sequence database so that the closed sequential patterns and the frequent multi-dimensional information are mined separately. The two are then combined to generate candidate multi-dimensional closed sequential patterns, and the candidates are pruned to obtain the multi-dimensional closed sequential patterns. Analysis and experimental comparison show that Mul_Clo_Seq is more efficient than UniSeq.
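
The bitmap idea can be illustrated with a small sketch. This is not the CSPBB algorithm itself, whose details are not given in the abstract. It is a minimal Python illustration that assumes a SPAM-style vertical bitmap (one integer per sequence, with bit i set when an item occurs in element i), sequence extension only (patterns are sequences of single items), and a naive post-filter for closedness in place of CloSpan's pruning. All function names and the example database are hypothetical.

```python
def item_bitmaps(db):
    """Map each item to a list of ints, one per sequence.
    Bit i of an int is set when the item occurs in element i of that sequence."""
    bitmaps = {}
    for sid, seq in enumerate(db):
        for pos, element in enumerate(seq):
            for item in element:
                bitmaps.setdefault(item, [0] * len(db))[sid] |= 1 << pos
    return bitmaps


def s_step(prefix_bits, item_bits, lengths):
    """Sequence extension: keep the positions of `item` that lie strictly
    after the earliest position at which the prefix pattern ends."""
    out = []
    for p, b, n in zip(prefix_bits, item_bits, lengths):
        if p == 0:
            out.append(0)
            continue
        first = (p & -p).bit_length() - 1                    # lowest set bit
        after = ((1 << n) - 1) & ~((1 << (first + 1)) - 1)   # bits above it
        out.append(after & b)
    return out


def support(bits):
    """Number of sequences in which the pattern occurs at least once."""
    return sum(1 for b in bits if b)


def mine(db, min_sup):
    """Depth-first search over sequence extensions; returns {pattern: support}."""
    bitmaps = item_bitmaps(db)
    lengths = [len(seq) for seq in db]
    items = sorted(bitmaps)
    frequent = {}

    def dfs(pattern, bits):
        frequent[pattern] = support(bits)
        for item in items:
            ext = s_step(bits, bitmaps[item], lengths)
            if support(ext) >= min_sup:
                dfs(pattern + (item,), ext)

    for item in items:
        if support(bitmaps[item]) >= min_sup:
            dfs((item,), bitmaps[item])
    return frequent


def is_subsequence(small, big):
    """True when `small` can be obtained from `big` by deleting items."""
    it = iter(big)
    return all(x in it for x in small)


def closed_patterns(frequent):
    """Naive post-filter (an assumption, not CloSpan's pruning): drop a
    pattern when a longer super-sequence has exactly the same support."""
    return {
        p: s for p, s in frequent.items()
        if not any(len(q) > len(p) and t == s and is_subsequence(p, q)
                   for q, t in frequent.items())
    }


# Hypothetical database: four sequences, each element a one-item set.
example_db = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}],
    [{"a"}, {"b"}, {"c"}],
    [{"b"}],
]
print(closed_patterns(mine(example_db, min_sup=2)))
```

With a minimum support of 2, the seven frequent patterns in this example reduce to three closed ones, which shows why the closed set is the more compact representation: the support of every dropped pattern can be recovered from a closed super-sequence.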
Keywords/Search Tags:Data Mining, Sequential patterns, Closed sequential patterns, Multi-dimensional Closed Sequential Patterns, Bitmap, Hash-tree