
Research On Time-interval Weighted Sequential Pattern Mining Algorithms For The Software Vulnerability

Posted on: 2014-06-11
Degree: Master
Type: Thesis
Country: China
Candidate: D X Chen
Full Text: PDF
GTID: 2268330422466888
Subject: Computer application technology
Abstract/Summary:
With the development of sequential pattern mining technology, weighted sequential patterns have been widely applied in fields such as biomedical science, economics, and networks, and have become a central issue in data mining. Existing weighted sequential pattern mining algorithms usually rely on pre-assigned weights and ignore, or make poor use of, the time and time-interval information of data elements. Because these algorithms are based on single-item weights, they do not reflect the importance of a sequence as a whole. In addition, some algorithms must scan the database many times or build temporary databases, so their memory usage is high. Traditional sequential pattern mining algorithms cannot be applied to data streams, and the existing data stream algorithms do not take time-interval weights into consideration either. To solve these problems, this thesis studies weighted sequential pattern mining algorithms in depth.

First, we propose MITWCSpan (Memory Indexing for Time-interval Weighted Closed Sequential pattern mining), a memory-based algorithm for time-interval weighted closed sequential pattern mining. The algorithm takes full account of the importance of the time intervals between data elements. An improved index set based on time intervals, p-tidx, is also defined. During mining, the algorithm applies the find-then-index technique recursively to find the items that can form a time-interval weighted sequential pattern and to construct the p-tidx for each candidate pattern. Finally, the algorithm uses closure detection to obtain the complete set of time-interval weighted closed sequential patterns.

Second, for data streams, we propose a time-interval weighted closed sequential pattern mining algorithm. The algorithm combines weight constraints with a sliding window and a divide-and-conquer strategy, which reduces running time. To produce a more compact result, it also uses closure detection to prune the dataset and obtain a more important set of sequential patterns.

Finally, we apply the algorithm to software security detection using a real-world instance.
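
The abstract does not give the weighting formula or the structure of p-tidx, so the following is only a minimal sketch of the general ideas it names: a weight derived from time intervals, a weighted support computed over a sliding window, and a closure check on the result. The function names, the exponential `decay` weighting, and the tolerance `tol` are all assumptions made for illustration and are not taken from the thesis.

```python
from collections import deque


def sequence_weight(seq, decay=0.5):
    """Hypothetical time-interval weight of one sequence.

    seq is a list of (timestamp, item) pairs sorted by timestamp.
    Each gap g between consecutive elements contributes decay ** g,
    so closely spaced elements weigh more. This formula is an
    assumption, not the one defined in the thesis.
    """
    times = [t for t, _ in seq]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if not gaps:
        return 1.0
    return sum(decay ** g for g in gaps) / len(gaps)


def contains(items, pattern):
    """True if pattern occurs in items as an order-preserving subsequence."""
    it = iter(items)
    return all(p in it for p in pattern)


def weighted_support(window, pattern, decay=0.5):
    """Share of the total sequence weight held by sequences containing pattern."""
    weights = [sequence_weight(s, decay) for s in window]
    total = sum(weights)
    if total == 0:
        return 0.0
    hit = sum(
        w for s, w in zip(window, weights)
        if contains([item for _, item in s], pattern)
    )
    return hit / total


def closed_only(supports, tol=1e-12):
    """Keep patterns that have no longer super-pattern with equal support."""
    closed = {}
    for p, s in supports.items():
        absorbed = any(
            len(q) > len(p) and contains(q, p) and abs(supports[q] - s) <= tol
            for q in supports
        )
        if not absorbed:
            closed[p] = s
    return closed


# A sliding window over a stream: only the most recent sequences are kept.
window = deque(maxlen=3)
window.append([(0, "a"), (1, "b"), (2, "c")])
window.append([(0, "a"), (2, "c")])
window.append([(0, "b"), (1, "a")])
candidates = [("a",), ("a", "c"), ("b",)]
supports = {p: weighted_support(window, p) for p in candidates}
compact = closed_only(supports)
```

Using `deque(maxlen=...)` means old sequences drop out automatically as new ones arrive, which is one simple way to model a sliding window. A real miner would also generate candidates incrementally instead of taking them from a fixed list.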
Keywords/Search Tags: weighted sequential pattern, time-interval, sliding window, software vulnerability