Research On Condensed Sequential Pattern Mining Based On Tree Structure

Posted on: 2011-02-22; Degree: Master; Type: Thesis
Country: China; Candidate: Y J Jie; Full Text: PDF
GTID: 2178330338991285; Subject: Computer software and theory
Abstract/Summary:
Existing sequential pattern mining algorithms can efficiently mine the complete set of sequential patterns in a large database. However, in many applications the user wants a more succinct set of patterns rather than all of them. This thesis studies how to mine condensed sequential patterns, how to mine them incrementally, and how to mine condensed repetitive gapped sequential patterns. Such patterns are important for analyzing customer purchase lists, web access logs, DNA sequences, credit card usage histories, program execution traces, and any other time-ordered sequences.

First, an efficient maximal sequential pattern mining algorithm, CSMS, is presented. The algorithm extends sequences by matching against a positional table and builds a PStree to maintain the candidate sequential patterns. The final maximal sequential patterns are obtained by pruning the PStree. CSMS achieves better time efficiency and scalability.

Second, an algorithm for mining closed repetitive gapped sequential patterns based on a repetitive linked WAP-Tree (RLWAP-Tree) is proposed. The algorithm builds a positional table for all frequent items and searches it for all 2-sequential patterns composed of different items. It also constructs an RLWAP-Tree. By progressively mining the projected trees of the existing patterns in the RLWAP-Tree, all closed repetitive gapped sequential patterns are obtained.

Finally, a closed sequential pattern mining algorithm for discovering the features of software bugs, MSPT, and an incremental algorithm, UMSPT, are proposed. MSPT searches for the semi-frequent and frequent 2-patterns using the positional information of the items, then constructs a BStree to maintain the semi-frequent and frequent items. The semi-closed and closed sequential patterns are found through projection. UMSPT inserts the new bug sequences into the BStree and searches the new branches for new semi-closed and closed sequential patterns. The final semi-closed and closed sequential patterns are then obtained by checking the inclusion relations among the existing patterns. Both MSPT and UMSPT achieve better time efficiency.
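
The abstract does not give the algorithms in detail, so the following Python sketch only illustrates three ideas it names: a positional table for items, a search of that table for frequent 2-sequences of different items, and an inclusion check that keeps only maximal patterns. It is not the CSMS, RLWAP-Tree, or MSPT implementation, and it builds no PStree or BStree. All function names are hypothetical, and support is assumed here to mean the number of sequences that contain a pattern, which differs from the repetitive support used for repetitive gapped patterns.

```python
from collections import defaultdict
from itertools import permutations


def build_positional_table(sequences):
    """Map each item to {sequence id: [positions where the item occurs]}."""
    table = defaultdict(lambda: defaultdict(list))
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            table[item][sid].append(pos)
    return table


def frequent_items(table, min_support):
    """Items that occur in at least min_support distinct sequences."""
    return sorted(item for item, occ in table.items() if len(occ) >= min_support)


def frequent_two_sequences(table, items, min_support):
    """Ordered pairs (a, b), a != b, where a precedes b in enough sequences."""
    result = {}
    for a, b in permutations(items, 2):
        # a precedes b in a sequence iff a's first position is before
        # b's last position in that sequence (assumed support definition).
        support = sum(
            1
            for sid, a_pos in table[a].items()
            if sid in table[b] and a_pos[0] < table[b][sid][-1]
        )
        if support >= min_support:
            result[(a, b)] = support
    return result


def is_subsequence(small, large):
    """True if small can be obtained from large by deleting items."""
    it = iter(large)
    return all(x in it for x in small)


def keep_maximal(patterns):
    """Drop every pattern that is contained in a different pattern."""
    return [
        p for p in patterns
        if not any(p != q and is_subsequence(p, q) for q in patterns)
    ]
```

For example, with the four sequences `abcb`, `acb`, `bac`, `ab` and a minimum support of 3, the table lookup reports `("a", "b")` and `("a", "c")` as the only frequent 2-sequences. A tree-based method such as those described above would avoid the pairwise inclusion test used in `keep_maximal`, which is quadratic in the number of patterns.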
Keywords/Search Tags: Positional information table, Sequence extension matching, WAP-Tree, Projection technology, Bug feature discovery