As an important topic in the field of data mining, sequential pattern mining has attracted growing attention from researchers worldwide. It has far-reaching significance in customer purchase analysis, DNA sequence analysis, web access pattern prediction, disease diagnosis, and natural disaster forecasting. In recent years, improving mining efficiency and satisfying user requirements have become the main research directions. The following work has been done on this topic.

Firstly, an improved bitmap structure is developed to store the original input dataset. It reduces the time cost of external-memory accesses and initializes the original data. To meet the requirement that users obtain the positions of sequential patterns in the database during mining, an address index table is defined. It is a two-dimensional table structure from which the positions of sequential patterns in the database can be read.

Secondly, an algorithm for mining closed sequential patterns based on the bitmap structure is developed. It addresses the problem of a super-sequence with lower support covering a sub-sequence with higher support. The pruning process of the algorithm is divided into itemset-extension and sequence-extension. In itemset-extension, candidate itemsets are generated by the itemset-extension tree, and pruning is performed by counting the support after the bitmap operation. In sequence-extension, sequences are extended by combining the frequent itemsets recorded in the address index table.

Thirdly, TarSpan is developed for mining the sequential patterns a user is interested in. It starts from the target sequential patterns and then performs itemset-extension and sequence-extension to build the target sequential patterns on the bitmap structure. The positions of the target sequential patterns are obtained by analyzing the address index table.
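The bitmap operations mentioned above can be illustrated with a minimal sketch. It is not the improved bitmap structure or the address index table of this work, whose layouts are not given here. It assumes a generic vertical bitmap in which each item keeps one integer per sequence and bit j is set when the item occurs in the j-th itemset of that sequence. All names (`build_bitmaps`, `itemset_extend`, `sequence_extend`, `positions`) and the toy database are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Sequence = List[FrozenSet[str]]   # an ordered list of itemsets
Bitmap = List[int]                # one integer per sequence; bit j = itemset j


def build_bitmaps(db: List[Sequence]) -> Dict[str, Bitmap]:
    """Scan the database once and build one vertical bitmap per item."""
    items = {item for seq in db for itemset in seq for item in itemset}
    bitmaps = {item: [0] * len(db) for item in items}
    for sid, seq in enumerate(db):
        for pos, itemset in enumerate(seq):
            for item in itemset:
                bitmaps[item][sid] |= 1 << pos
    return bitmaps


def support(bitmap: Bitmap) -> int:
    """Number of sequences in which the pattern occurs at least once."""
    return sum(1 for bits in bitmap if bits)


def itemset_extend(a: Bitmap, b: Bitmap) -> Bitmap:
    """Itemset-extension: both patterns must end in the same itemset."""
    return [x & y for x, y in zip(a, b)]


def sequence_extend(a: Bitmap, b: Bitmap) -> Bitmap:
    """Sequence-extension: b must occur strictly after the first end of a."""
    out = []
    for x, y in zip(a, b):
        if x == 0:
            out.append(0)
            continue
        first = x & -x                  # lowest set bit = earliest position
        after = ~((first << 1) - 1)     # mask of all later positions
        out.append(y & after)
    return out


def positions(bitmap: Bitmap) -> List[Tuple[int, int]]:
    """Assumed stand-in for a position lookup: (sequence id, itemset index)."""
    found = []
    for sid, bits in enumerate(bitmap):
        pos = 0
        while bits:
            if bits & 1:
                found.append((sid, pos))
            bits >>= 1
            pos += 1
    return found


# Toy database of three sequences (illustrative data only).
db = [
    [frozenset("a"), frozenset("b"), frozenset("c")],
    [frozenset("ab"), frozenset("c")],
    [frozenset("b"), frozenset("a")],
]
bm = build_bitmaps(db)
ab_together = itemset_extend(bm["a"], bm["b"])   # pattern <(a b)>
a_then_c = sequence_extend(bm["a"], bm["c"])     # pattern <(a)(c)>
```

With a minimum support of 2, `a_then_c` would be kept and `ab_together` pruned, and `positions(a_then_c)` reports where the pattern ends in each supporting sequence. The support count needs only bitwise operations on the in-memory bitmaps, which is the kind of saving a bitmap representation is meant to give.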