Design And Implementation Of The Phone Virus System Based On Sequential Patterns Mining

Posted on: 2014-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: W L Cui
Full Text: PDF
GTID: 2248330398972437
Subject: Computer Science and Technology
Abstract/Summary:
Since the concept of data mining was first proposed, its ideas and methods have found mature application in many fields, including the computer and Internet industry, the biosciences, and the financial industry. In today's information age, data is growing explosively, and data mining has become an indispensable tool for extracting useful information from massive data sets.

Association rule mining has proven effective in many areas. However, by their nature, association rules cannot express the temporal ordering between transactions, and sequential pattern mining algorithms were proposed to solve this problem. This thesis makes an in-depth study of sequential pattern mining algorithms, including algorithms that work on the horizontal data format, such as GSP, AprioriAll, and PrefixSpan, and the SPADE algorithm, which works on the vertical data format. These basic algorithms, however, suffer from high space and time complexity.

To address these problems, the thesis first proposes practical solutions, based on the theory of data preprocessing in data mining and on the actual data of the Siemens phone virus mining system project, so that the mining algorithm can be applied to real data. Then, building on the idea of the SPADE algorithm, it uses a multi-way tree structure to generate the set of frequent sequential patterns and, combined with the theory of closed sequential patterns, proposes a new method for determining whether a sequential pattern is closed, which is the core of the thesis. Comparative experiments evaluate the closed sequential pattern mining algorithm based on S_List (SLCSP) against the classic CloSpan algorithm in terms of efficiency and effectiveness. Experiments on data from the IBM sequence data generator show that SLCSP is more efficient than CloSpan, and the experimental results demonstrate the effectiveness of the algorithm in mining the actual virus data.
Keywords/Search Tags:Data mining, Data preprocessing, Sequential pattern, SPADE, Closed Sequential Pattern
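
The notion of a closed sequential pattern used in the abstract can be illustrated with a small sketch. A frequent sequential pattern is closed when no proper super-sequence has the same support. The code below is a brute-force illustration of that definition only; it is not the SLCSP algorithm, does not use the S_List or multi-way tree structures (the abstract does not describe them in detail), and assumes for simplicity that every element of a sequence is a single item. The function names and the toy database are hypothetical.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is respected.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Brute force: test every distinct subsequence of every input sequence.

    Exponential in the sequence length; suitable for toy inputs only.
    """
    candidates = set()
    for seq in database:
        for k in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), k):
                candidates.add(tuple(seq[i] for i in idx))
    result = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result


def closed_patterns(frequent):
    """Keep patterns that have no longer super-sequence with equal support.

    A super-sequence with equal support is itself frequent, so it is enough
    to search among the frequent patterns.
    """
    closed = {}
    for pattern, count in frequent.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_count == count
            and is_subsequence(pattern, other)
            for other, other_count in frequent.items()
        )
        if not absorbed:
            closed[pattern] = count
    return closed


# Hypothetical toy database: three event sequences over items a, b, c.
toy_db = [("a", "b", "c"), ("a", "b", "c"), ("a", "c")]
toy_frequent = frequent_patterns(toy_db, min_support=2)
toy_closed = closed_patterns(toy_frequent)
```

On this toy database with a minimum support of 2, all seven subsequences are frequent, but only `("a", "c")` with support 3 and `("a", "b", "c")` with support 2 are closed. The closed set is much smaller than the full frequent set while still determining every frequent pattern's support, which is the motivation for mining closed patterns directly.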