Research On An Algorithm For Time Sequential Pattern Mining

Posted on:2008-02-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y Yan

Full Text:PDF

GTID:2178360242958955

Subject:Computer application technology

Abstract/Summary:

Data mining has become one of the fastest-growing areas of research in recent years. Beyond association rule mining, researchers have worked to develop mining methods that take the time factor into account. Popular research topics include customer buying pattern analysis, Internet surfing time-series analysis, and trend analysis. When probing customers' buying time-series patterns, most existing mining methods require repeated database scans to generate candidate patterns, which are then checked to find the frequent time-series patterns.

The mining of sequential patterns is one of the most active topics in data mining. Its purpose is to find the frequent sequences in transaction databases and to use these patterns to support decision-makers. The concept of a sequential pattern is introduced to capture typical behaviours over time, i.e. behaviours repeated by individuals often enough to be relevant to the decision-maker. We are given a database of sequences, where each sequence is a list of transactions ordered by transaction time and each transaction is a set of items. The problem is to discover all sequential patterns that meet a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern.

Execution efficiency is one of the important problems in data mining. The AprioriAll algorithm is a method for finding sequential patterns, but it has high space and time complexity. Therefore, this thesis introduces a new algorithm based on an adjacency matrix that does not need to generate candidate itemsets. The algorithm produces frequent patterns by joining suffixes with prefixes, which avoids scanning the database many times and lowers the time cost.

This thesis also presents an approach for mining sequential patterns over a database of sequences. The algorithm uses a new data structure, which we name the "sequences thread tree", and is discussed in detail. We evaluated the algorithm on several synthetic datasets, and the key algorithms were tested and verified. The impact of the parameters on mining performance and results was examined experimentally. The performance of TTSP and FPAM is compared, and the empirical evaluation indicates that the incremental idea behind the algorithm is sound and that it is much faster than conventional mining. The algorithm also scales linearly with the number of data-sequences and has very good scale-up properties with respect to the average data-sequence size.
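
The problem statement in the abstract (support as the number of data-sequences that contain a pattern) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: it is a naive containment check and support count, and the names `contains`, `support`, and the toy database `db` are hypothetical, introduced only for illustration.

```python
from typing import List, Set

# A transaction is a set of items; a data-sequence is a list of
# transactions ordered by transaction time (as defined in the abstract).
Transaction = Set[str]
DataSequence = List[Transaction]


def contains(data_seq: DataSequence, pattern: DataSequence) -> bool:
    """Return True if each itemset of `pattern` is a subset of a distinct
    transaction of `data_seq`, with the order of the itemsets preserved."""
    pos = 0
    for itemset in pattern:
        # Greedily advance to the earliest transaction that covers this itemset.
        while pos < len(data_seq) and not itemset <= data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1  # the next itemset must match a strictly later transaction
    return True


def support(database: List[DataSequence], pattern: DataSequence) -> int:
    """Number of data-sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))


# Hypothetical toy database of three customer sequences.
db = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a", "b"}, {"c"}],
    [{"b"}, {"a"}, {"c", "d"}],
]

min_support = 2  # user-specified threshold (assumed value)
pattern = [{"b"}, {"d"}]
is_frequent = support(db, pattern) >= min_support
```

Here the pattern "b, then later d" is contained in the first and third sequences, so its support is 2 and it meets the assumed threshold. A full miner would enumerate patterns rather than test a single one; the repeated scans that this naive approach implies are exactly the cost the abstract says the proposed algorithm tries to avoid.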

Keywords/Search Tags:

data mining, sequence patterns, sequences thread tree, incremental data mining

Related items

1	An Efficient Algorithm With Incremental Data Mining For Mining Sequential Pattern(NPSP)
2	Research On Mining Algorithm Of Web Log Frequent Sequential Patterns
3	Study On Frequent Pattern Mining Algorithms And Pruning Strategies
4	Research On Mining Periodic Frequent Patterns Common To Multiple Sequences
5	Mining Shared Knowledge\Patterns Between Two Datasets
6	Research On Key Techniques Of Mining Negative Sequential Patterns Based On Non-occurring Items
7	Research On Frequent Subtree Mining
8	Enhanced PL-WAP tree method for incremental mining of sequential patterns
9	Incremental Web Data Mining System Based On Classification Tree
10	Data Mining Algorithm And Its Application In The Tourism Industry