
The Research On Sequential Pattern Mining

Posted on: 2011-03-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Wu
GTID: 2178360305472748
Subject: Computer software and theory
Abstract/Summary:
Sequential pattern mining is an important branch of data mining and plays a significant role in finance, communications and other application areas. Although sequential pattern mining algorithms are by now relatively mature, most of them target static sequence data. Real-world data, however, is updated constantly, so improving the time and space efficiency of incremental sequential pattern mining is an important research topic. Based on an analysis of the characteristics of sequential pattern mining, this thesis proposes an incremental mining method for sequential patterns. Meanwhile, privacy preservation in data mining has become increasingly prominent, and privacy-preserving sequential pattern mining has gained importance accordingly. To this end, the thesis also puts forward a method for privacy-preserving sequential pattern mining.

Incremental update algorithms for sequential pattern mining tend to generate a large number of candidate sets and to scan the database repeatedly. To address these problems, an efficient incremental update algorithm, SPIU2SM, is proposed. The algorithm uses ESPE, a mining algorithm based on the 2-sequence matrix, to scan the original database and the incremental database only once and then generate the sequential patterns. SPIU2SM uses 2-sequences and the 2-sequence matrix to re-encode the sequence data, which reduces the space complexity of the update algorithm. In addition, pruning of frequent and non-frequent patterns reduces the number of sequence comparisons and scans, which reduces the time complexity of the update algorithm. The experiments show that the algorithm is effective and accurate.

The random hiding algorithm used in privacy-preserving sequential pattern mining has the drawback that it requires substantial changes to the original data. To address this, PPSM (a sequential pattern mining algorithm for privacy preserving) is proposed. On the one hand, the algorithm reduces time complexity by preprocessing the sensitive sequential patterns that need to be hidden. On the other hand, when choosing items to delete, it gives priority to items that provide common support, which reduces the changes made to the original sequence data. Together these improve the execution efficiency of the algorithm and lower the ratio of original data that is modified. Performance analysis and experimental results both show that the algorithm is effective and accurate.
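The single-scan idea behind incremental updating can be illustrated with a small sketch. It is not the thesis's SPIU2SM or ESPE algorithm, whose details the abstract does not give. It assumes that a 2-sequence is an ordered pair (a, b) with a occurring before b in a sequence, that each sequence is a plain list of single items, and that support is counted once per sequence. The function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def two_sequence_counts(sequences):
    """Count, for each ordered pair (a, b), how many sequences contain a before b.

    Assumption: a sequence is a list of single items; this is an illustrative
    stand-in for a 2-sequence matrix, not the thesis's data structure.
    """
    counts = Counter()
    for seq in sequences:
        # combinations() preserves input order, so each pair is (earlier, later).
        # The set ensures a pair is counted at most once per sequence.
        counts.update(set(combinations(seq, 2)))
    return counts


def frequent_two_sequences(counts, total_sequences, min_support):
    """Return the pairs whose support count reaches min_support * total_sequences."""
    threshold = min_support * total_sequences
    return {pair for pair, count in counts.items() if count >= threshold}


# Hypothetical data: an original database and a later increment.
original = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
increment = [["a", "b"], ["b", "c"], ["a", "b", "c"]]

# Each database is scanned exactly once; the stored counts are simply merged.
counts = two_sequence_counts(original)
counts += two_sequence_counts(increment)

total = len(original) + len(increment)
frequent = frequent_two_sequences(counts, total, min_support=0.5)
print(sorted(frequent))
```

Because the pair counts from the original database are kept, only the increment has to be scanned when new sequences arrive, and the original sequences are never re-read.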
Keywords/Search Tags:data mining, sequential patterns, incremental updates, privacy protection, modify rat