
Research On Frequent Pattern Mining Algorithms Based On Bit Vectors In Data Streams For The Software Vulnerability

Posted on: 2013-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zhang
GTID: 2248330392454705
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology and the steady improvement in people's ability to acquire data, the data stream has emerged as a new data model in a growing range of applications, such as call records in telecommunications, online transactions in retail chains, and network traffic analysis. Because a data stream is an unbounded sequence of transactions that arrives continuously at high speed and changes over time, traditional pattern mining methods are no longer feasible. How to store a data stream efficiently and mine frequent patterns quickly has therefore become a hot research topic. This thesis studies new algorithms for mining frequent patterns in data streams and in uncertain data streams.

First, an algorithm for mining frequent patterns in data streams based on bit vector decomposition and a hash linked list is proposed. In this algorithm, arriving transactions are converted into bit vectors, the converted results are decomposed by enumerating their permutations and combinations, and the decomposed itemsets are stored in the hash linked list. The anti-monotone property is then used to prune infrequent itemsets.

Second, an algorithm for uncertain data streams based on a bit vector table and a compressed tree is proposed. The uncertain data stream is first initialized into a probability-vector table, in which the items are represented by transactions. A compressed tree is then proposed in which items with different probabilities are stored in the same tree nodes, which significantly reduces the number of tree nodes. In addition, each leaf node of the tree is linked to an array that stores the combinations of all items on the path together with their expected support.

Finally, a software vulnerability analysis algorithm based on frequent pattern mining is proposed. The algorithm first collects existing programs with software vulnerabilities and converts them into software vulnerability sequences, then converts the sequences into software vulnerability itemsets. Next, a different frequent pattern mining algorithm is adopted according to the form of the software vulnerabilities, and the frequent software vulnerabilities that are mined are given a higher priority.
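The first algorithm's general idea (encode each transaction as a bit vector, decompose it into sub-itemsets, count them in a hash structure, then prune by anti-monotonicity) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the item universe `ITEMS`, the function names, the `max_size` cap, and the use of a plain dictionary in place of the hash linked list are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

ITEMS = ["a", "b", "c", "d"]  # hypothetical item universe
INDEX = {item: i for i, item in enumerate(ITEMS)}


def to_bit_vector(transaction):
    """Encode a transaction as an int with one bit set per item."""
    bits = 0
    for item in transaction:
        bits |= 1 << INDEX[item]
    return bits


def decode(bits):
    """Turn a bit vector back into a frozenset of item names."""
    return frozenset(ITEMS[i] for i in range(len(ITEMS)) if (bits >> i) & 1)


def decompose(bits, max_size):
    """Yield every non-empty sub-itemset of at most max_size items."""
    positions = [i for i in range(len(ITEMS)) if (bits >> i) & 1]
    for k in range(1, max_size + 1):
        for combo in combinations(positions, k):
            yield sum(1 << i for i in combo)


def mine(stream, min_support, max_size=3):
    # A plain dict stands in for the hash linked list (assumption).
    counts = defaultdict(int)
    for transaction in stream:
        for itemset in decompose(to_bit_vector(transaction), max_size):
            counts[itemset] += 1

    frequent, infrequent = {}, []
    # Visit smaller itemsets first so that pruning can use them.
    for itemset in sorted(counts, key=lambda b: (bin(b).count("1"), b)):
        # Anti-monotone property: any superset of an infrequent
        # itemset is infrequent, so skip it without checking its count.
        if any((small & itemset) == small for small in infrequent):
            continue
        if counts[itemset] >= min_support:
            frequent[itemset] = counts[itemset]
        else:
            infrequent.append(itemset)
    return {decode(bits): n for bits, n in frequent.items()}


if __name__ == "__main__":
    stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
    for itemset, support in mine(stream, min_support=2).items():
        print(sorted(itemset), support)
```

Representing itemsets as integers makes the subset test a single bitwise AND, which is one reason bit vectors suit this kind of pruning. A real stream miner would also need a windowing or decay policy to bound memory, which this sketch omits.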
Keywords/Search Tags: data stream, frequent pattern, bit vector, hash linked list, compressed tree