Research On Models Of Sequential Patterns Mining

Posted on: 2008-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhang
Full Text: PDF
GTID: 2178360215951391
Subject: Computer application technology
Abstract/Summary:
The rapid development of information technology has led to a rapid increase in all kinds of data. This growth gave rise to Knowledge Discovery in Databases (KDD) and data mining, which offer a new approach to understanding data. The discovery of sequential patterns is an important field in KDD. This dissertation studies the problem of mining sequential patterns in databases. The main contents are as follows:

(1) The basic model of sequential patterns and several classical mining algorithms are discussed in detail. On this basis, the application prospects and possible challenges of sequential pattern mining are presented.

(2) A model called the Frequent 2-Sequence Graph (F2SG) is presented, and based on it a new sequential pattern mining algorithm, GBSPM, is proposed. Traditional algorithms need to scan the database multiple times and spend much of their time on I/O. The graph model presented in this dissertation captures the sequence information relevant to the mining task by scanning the transaction database only once. The graph representation of the database fully exploits the ordering of items during mining, which improves the efficiency of generating frequent sequences. Moreover, it handles time constraints easily. Experimental results show that it has better time performance than traditional sequential pattern mining algorithms.

(3) A sequence clustering method that uses the discovered sequential patterns is proposed. The method defines the similarity between data sequences and the mean of a data sequence cluster, so that classical clustering methods can be applied to sequence data to discover a set of high-quality data sequence clusters containing similar sequential patterns. Theoretical analysis and experiments show that it not only generates optimal clusters but also exhibits good efficiency.
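The single-scan graph idea in item (2) can be illustrated with a minimal sketch. The abstract does not give the formal definition of F2SG or the GBSPM algorithm, so everything below is an assumption made for illustration: each data sequence is taken to be a plain list of single items, an edge (a, b) means that a occurs somewhere before b in a sequence, and the hypothetical function `build_f2sg` keeps only edges whose support reaches `min_support`. It is not the thesis's actual construction.

```python
from collections import defaultdict


def build_f2sg(sequences, min_support):
    """Build a graph of frequent ordered item pairs in one pass over the data.

    Assumption (not from the thesis): a 2-sequence <a, b> is supported by a
    data sequence if a appears at some position before b in that sequence.
    """
    pair_support = defaultdict(int)

    # Single scan: every data sequence is read exactly once.
    for seq in sequences:
        seen_before = set()    # items already encountered in this sequence
        pairs_in_seq = set()   # each pair counts at most once per sequence
        for item in seq:
            for earlier in seen_before:
                pairs_in_seq.add((earlier, item))
            seen_before.add(item)
        for pair in pairs_in_seq:
            pair_support[pair] += 1

    # Keep only frequent 2-sequences as directed, support-labelled edges.
    graph = defaultdict(dict)
    for (a, b), support in pair_support.items():
        if support >= min_support:
            graph[a][b] = support
    return dict(graph)


# Hypothetical toy database of three data sequences.
toy_db = [["a", "b", "c"], ["a", "c"], ["b", "a", "c"]]
f2sg = build_f2sg(toy_db, min_support=2)
# f2sg == {"a": {"c": 3}, "b": {"c": 2}}
```

Under these assumptions, longer candidate sequences could be generated by walking paths in the graph instead of rescanning the database, although the support of each longer sequence would still need to be verified separately.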
Keywords/Search Tags: Data Mining, sequential pattern, frequent sequence graph, sequence clustering