Mining Sequential Patterns With One-off Condition

Posted on: 2016-02-14    Degree: Master    Type: Thesis
Country: China    Candidate: P Zhao    Full Text: PDF
GTID: 2348330536487049    Subject: Computer Science and Technology
Abstract/Summary:
Real-world applications produce a wealth of sequence data, and analyzing that data to extract useful knowledge is a worthwhile goal. Mining frequent patterns from sequence databases is a central task of sequential pattern mining. With the rapid development of information technology and the spread of the Internet, sequential pattern mining has become an important research topic in data mining. Mining sequential patterns with the one-off condition is one of the problems of sequential pattern mining with wildcards. It improves on traditional pattern mining in that it allows wildcards between the elements of a frequent pattern and requires the occurrences of a frequent pattern to satisfy the one-off condition. Introducing the one-off condition into sequential pattern mining has theoretical value, and it also suits practical applications, where it is often unnecessary to find all occurrences of a pattern P. This thesis therefore studies algorithms for mining sequential patterns with the one-off condition, with the aim of improving the efficiency of mining frequent patterns. The main contributions are as follows:

1. The mining algorithms in this thesis are based on pattern matching. Three algorithms for calculating support are designed using the Nettree structure: Cal-SRMP (Calculating Support with Strategy of Right Most Parent), Cal-SGSP (Calculating Support with Strategy of Greedy-Search Parent), and Cal-SBO (Calculating Support with Selecting Better Occurrence). Cal-SGSP uses the greedy-search parent strategy to find an occurrence of the pattern: at each step it looks for an approximately optimal parent of the current node. Cal-SRMP uses the rightmost parent strategy: at each step it takes the rightmost parent of the current node as the next position of the pattern occurrence. Cal-SBO applies both strategies to find two occurrences that share the same leaf and keeps the one with the smaller number of related occurrences. Finally, Cal-SBO returns the support of the candidate pattern.

2. A framework for sequential pattern mining with the one-off condition is presented. The framework is built on the pattern matching algorithms: the support of each candidate pattern is calculated with one of the three methods above, and frequent patterns are then determined from the support of the candidates. The three resulting mining algorithms are SBO-Mining, SRMP-Mining, and SGSP-Mining. To avoid checking too many candidate patterns, the algorithms use the Apriori property to prune candidates, which reduces the number of support calculations.

3. A large number of comparative experiments are designed. The experimental results are analyzed from two aspects, the mining results and the efficiency of the algorithms, and they verify the effectiveness of the proposed algorithms.
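
As a rough illustration of the ideas summarized above, the following Python sketch counts occurrences of a pattern under a one-off condition and then mines frequent patterns level by level with Apriori-style pruning. It is not Cal-SRMP, Cal-SGSP, or Cal-SBO, and it does not build a Nettree: it substitutes a plain leftmost greedy search with backtracking, so the count it returns is only a heuristic lower bound on the true support. It assumes that the one-off condition means each sequence position may be used by at most one occurrence, and that a gap is the number of wildcards between two consecutive pattern elements. The names `one_off_occurrences`, `mine_frequent`, `min_gap`, `max_gap`, `min_sup`, and `max_len` are hypothetical and do not come from the thesis.

```python
from typing import Dict, List, Optional


def one_off_occurrences(seq: str, pattern: str,
                        min_gap: int, max_gap: int) -> List[List[int]]:
    """Greedily collect occurrences of `pattern` in `seq` such that no
    sequence position is shared by two occurrences (assumed one-off rule).
    Between consecutive pattern elements, min_gap..max_gap wildcards are allowed."""
    if not pattern:
        return []
    used = [False] * len(seq)

    def extend(pos: int, k: int) -> Optional[List[int]]:
        # Try to match pattern[k:] after position `pos`, leftmost first.
        if k == len(pattern):
            return []
        lo = pos + min_gap + 1
        hi = min(pos + max_gap + 1, len(seq) - 1)
        for nxt in range(lo, hi + 1):
            if not used[nxt] and seq[nxt] == pattern[k]:
                rest = extend(nxt, k + 1)
                if rest is not None:
                    return [nxt] + rest
        return None  # dead end, caller backtracks

    occurrences: List[List[int]] = []
    for start in range(len(seq)):
        if used[start] or seq[start] != pattern[0]:
            continue
        rest = extend(start, 1)
        if rest is not None:
            occ = [start] + rest
            for p in occ:
                used[p] = True  # positions are consumed by this occurrence
            occurrences.append(occ)
    return occurrences


def mine_frequent(seq: str, alphabet: str, min_gap: int, max_gap: int,
                  min_sup: int, max_len: int) -> Dict[str, int]:
    """Level-wise mining: only patterns that reached `min_sup` are extended,
    which is the Apriori-style pruning step."""
    frequent: Dict[str, int] = {}
    level = list(alphabet)
    length = 1
    while level and length <= max_len:
        survivors = []
        for cand in level:
            sup = len(one_off_occurrences(seq, cand, min_gap, max_gap))
            if sup >= min_sup:
                frequent[cand] = sup
                survivors.append(cand)
        # Candidates of the next length grow only from frequent patterns.
        level = [p + c for p in survivors for c in alphabet]
        length += 1
    return frequent
```

For example, `one_off_occurrences("AATT", "AT", 0, 1)` yields two occurrences that share no position, whereas `one_off_occurrences("ATAT", "ATA", 0, 0)` yields a single occurrence because any second match would have to reuse a position that is already taken. A different choice of parent at each step, which is what the three strategies in the thesis address, can change how many occurrences a greedy search finds.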
Keywords/Search Tags: sequential pattern mining, frequent pattern, one-off condition, pattern matching, Nettree