
Malware Detection Based On Sequential Pattern Mining Algorithms

Posted on: 2010-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: L N Wang
Full Text: PDF
GTID: 2178360302959752
Subject: Spread and control of network systems
Abstract/Summary:
Malware detection is an important area of computer security. This thesis analyzes the advantages and disadvantages of the detection methods developed over nearly three decades. It finds that signature-based misuse detection shows its weaknesses now that malicious code appears in large volumes, while anomaly detection, although it offers proactive protection, is not accurate enough for practical use. To address these deficiencies in the setting of host-based malicious code detection, this thesis presents a method that combines data mining with an expert system to detect malware. The detection system has three distinguishing characteristics: (1) it is based on malware behavior signatures; (2) it incorporates expert system technology; (3) it uses data mining algorithms to discover behavioral patterns of malware.

The main problem studied is how to apply a sequential pattern mining algorithm in a malware detection expert system. The main steps are as follows. First, malicious behaviors are extracted from virus samples with the SSM and EQSecure tools to form a sequence-view behavior database. Then PrefixSpan is used to mine sequential patterns (behavior signatures) from the behavior sequences to form a behavior pattern database. Finally, the expert system performs inference by matching facts against rules and gives the final result.

My main work is listed below:

(1) Build a sequence database of malware behaviors: the SSM and EQSecure tools are used to extract behavior sequences from malware samples and form a sequence-view behavior database.

(2) Improve the PrefixSpan sequential pattern mining algorithm: most previously developed sequential pattern mining methods are Apriori-like and still encounter problems when the sequence database is large and/or when the sequential patterns to be mined are numerous and/or long. This thesis therefore proposes an improved sequential pattern mining algorithm, called PrefixSpan B, which uses a brief projected database instead of the projected database of PrefixSpan. Experiments verify that the improved algorithm is faster than the original.

(3) Apply sequential pattern mining algorithms in a malware detection system: the PrefixSpan B algorithm helps to analyze the "behavior signatures" of malware, making the knowledge base more comprehensive and effective. Experiments show the correctness and effectiveness of the improved algorithm. In addition, the behavior pattern database is far smaller than one built without any mining algorithm, and patterns of different lengths can be selected as detection rules.
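The mining and matching steps described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions: each behavior sequence is a plain list of event names, the event names themselves are invented for illustration, and the projected database is stored as (sequence index, offset) pairs, a common PrefixSpan variant usually called pseudo-projection. It is not the thesis's PrefixSpan B, whose brief projected database is not specified in the abstract, and the `matches` helper is only a hypothetical stand-in for the expert system's rule matching.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern tuple: support} for every frequent sequential pattern.

    Support is the number of sequences that contain the pattern as a
    (not necessarily contiguous) subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs, so suffixes
        # are referenced by position instead of being copied.
        occurrences = defaultdict(list)
        for seq_id, start in projected:
            seq = sequences[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    # Only the first occurrence in each suffix is recorded;
                    # that is enough for counting support and projecting.
                    seen.add(item)
                    occurrences[item].append((seq_id, pos + 1))
        for item, new_projected in occurrences.items():
            if len(new_projected) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(new_projected)
                mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


def matches(pattern, trace):
    """True if pattern occurs in trace as an ordered subsequence."""
    remaining = iter(trace)
    # "event in remaining" consumes the iterator up to the first match,
    # which enforces the ordering of the pattern's events.
    return all(event in remaining for event in pattern)


# Hypothetical behavior sequences; the event names are illustrative only.
traces = [
    ["create_file", "write_registry", "create_process"],
    ["create_file", "create_process", "connect_network"],
    ["write_registry", "create_file", "create_process"],
]

patterns = prefixspan(traces, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(support, " -> ".join(pattern))
```

In this sketch, patterns of a chosen length could then be kept as detection rules and tested against a new trace with `matches`, which loosely mirrors the abstract's remark that patterns of different lengths can be selected as rules.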
Keywords/Search Tags: malware detection, sequential pattern mining, PrefixSpan algorithm, projected database