Research On Improvement Of Algorithm PrefixSpan For Sequential Pattern Mining

Posted on: 2018-09-12  Degree: Master  Type: Thesis
Country: China  Candidate: T Yang  Full Text: PDF
GTID: 2348330515975365  Subject: Computational Mathematics
Abstract/Summary:
Sequential pattern mining is an important branch of data mining research, and the study of its algorithms is equally important. This thesis examines the classic sequential pattern mining algorithms in detail and reports the following results. First, the classical algorithms AprioriAll, GSP, SPADE and SPAM are compared and analyzed. Second, the pattern-growth algorithm PrefixSpan is studied in depth: the occurrence of repeated projected databases during mining is proved theoretically, the supremum and infimum of the number of projection operations performed by the algorithm are derived, and a general formula is given for the number of repeated projected databases in the worst case. Third, building on the idea of prefix projection and borrowing from the data structure of SPAM, a 2-D table is used to store the position information of frequent items in the sequence database. To count sequence support and decide quickly which sequences are frequent, a new algorithm counts the non-empty position sets of the 2-D table along the column direction only. Finally, to avoid using the same projection table repeatedly during mining, the position of the prefix sequence is checked beforehand.
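The position-table idea summarized above can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes every sequence element is a single item (no itemsets), and the names `build_position_table` and `mine` are hypothetical. Each frequent item gets one column that records its positions per sequence, support is the number of non-empty cells in the surviving column, and a prefix is extended by position lookup instead of copying a projected database.

```python
from collections import defaultdict


def build_position_table(sequences, min_support):
    """Return {item: {sequence id: [positions]}} for frequent items only."""
    table = defaultdict(dict)
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            table[item].setdefault(sid, []).append(pos)
    # An item's support is the number of non-empty cells in its column.
    return {item: col for item, col in table.items() if len(col) >= min_support}


def mine(sequences, min_support):
    """Return {pattern tuple: support} for all frequent sequential patterns."""
    table = build_position_table(sequences, min_support)
    results = {}

    def grow(prefix, ends):
        # ends maps a sequence id to the position where the earliest
        # match of `prefix` ends in that sequence (-1 for the empty prefix).
        for item, col in table.items():
            new_ends = {}
            for sid, end in ends.items():
                # First occurrence of `item` strictly after the prefix match.
                nxt = next((p for p in col.get(sid, ()) if p > end), None)
                if nxt is not None:
                    new_ends[sid] = nxt
            if len(new_ends) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(new_ends)
                grow(pattern, new_ends)

    grow((), {sid: -1 for sid in range(len(sequences))})
    return results


if __name__ == "__main__":
    db = [list("abc"), list("acb"), list("ab")]
    for pattern, support in sorted(mine(db, min_support=2).items()):
        print("".join(pattern), support)
```

Tracking only the end position of the earliest match per sequence plays the role of a projected database here, so no suffixes are copied. The check for repeated projection tables described in the abstract is not modeled in this sketch.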
Keywords/Search Tags: Data mining, Sequential pattern, Apriori algorithm, PrefixSpan algorithm