Research And Implementation Of Malware Classification

Posted on: 2012-09-12  Degree: Master  Type: Thesis
Country: China  Candidate: Z H Fang  Full Text: PDF
GTID: 2218330341951681  Subject: Computer technology
Abstract/Summary:
Fast and accurate classification of malicious code is key to defending against malware, because it underpins the detection, containment, and removal of malicious code. Malware classification has become one of the hot topics in the security field, and it is the focus of this thesis.

Existing classification methods have several defects. First, they are too slow to handle the very large malware sample sets that anti-malware companies face, and few methods have been put into practical use. Second, they scale poorly: some approaches take the labels assigned by anti-virus software as the benchmark categories for training, so they cannot identify categories that were never trained. Third, they are imprecise, either because they do not capture a sample's behavior well enough or because of limitations in the analysis technology itself. Imprecision here means either placing samples of different types in the same group or failing to recognize similar malware programs.

Building on an in-depth analysis of existing malicious code analysis technology and on the analysis of a large number of malicious code samples, this thesis proposes using the sequence of runtime behaviors as a sample's features, constructs a behavioral knowledge base of malicious code, designs and develops a malicious code classification system, and reports experimental results. The main work is as follows:

1. Collection and analysis of malicious code, and design of an automated malware behavior analysis system. Based on the open-source software Zero Wine, we design an automated sample analysis system that generates a behavior analysis report for each sample. Because the software may fail to analyze some packed samples correctly, such samples should be unpacked and decrypted before behavior analysis.

2. Feature extraction and construction of the behavioral knowledge base. By analyzing the samples' behavior reports, this thesis takes the sequence of runtime behaviors as a sample's features, adds the samples' behavior information to a database, and so constructs the behavioral knowledge base of malicious code.

3. Construction of benchmark categories and malware family prototypes based on a clustering algorithm. Clustering of behavior aims at discovering novel classes of malware with similar behavior; classification of behavior assigns unknown malware to known classes of behavior. First, we map the samples' behavioral characteristics to a high-dimensional feature space. Second, we generate baseline categories of malware with a clustering algorithm and extract a prototype of each family's characteristics, called the gene code: the common features present in a malware program and its variants, which identify the malware family. Finally, we classify malicious code based on the family gene codes.

4. Incremental analysis to update the family gene code database, which improves the scalability of the system. A gene code database generated during one period may not remain applicable for long and needs regular updating. Traditionally, new samples are combined with some or all of the earlier sample sets to rebuild the training set, which is then retrained to produce a new gene code database. To avoid this repeated learning and its time and space cost, we introduce an incremental analysis method: after new samples are classified, the samples that remain uncategorized are clustered, their gene codes are extracted and added to the database, and the samples are then classified.

5. Design and implementation of the malware classification system. Based on the observation that the code and behavior of a malware family are highly similar, we study the key technologies involved and design and implement a malware classification system.

6. Accuracy testing of the system and comparative testing of the algorithms. The results show that the system achieves high accuracy and meets the expected results.
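
The abstract does not specify the feature mapping, the clustering algorithm, or the form of the gene code, so the following Python sketch is only one possible reading of the steps for feature mapping, clustering, classification, and incremental update. It assumes behavior sequences are embedded as normalized bigram counts, compares them by cosine similarity, uses a simple greedy (leader) clustering, and approximates each family's gene code by the cluster centroid. All function names, the threshold of 0.5, and the behavior event names are hypothetical.

```python
from collections import Counter
from math import sqrt

def ngram_vector(behaviors, n=2):
    """Map a behavior sequence to a unit-length sparse vector of n-gram counts."""
    grams = Counter(tuple(behaviors[i:i + n]) for i in range(len(behaviors) - n + 1))
    norm = sqrt(sum(c * c for c in grams.values()))
    return {g: c / norm for g, c in grams.items()} if norm else {}

def cosine(u, v):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * v.get(g, 0.0) for g, w in u.items())

def centroid(vectors):
    """Sum of sparse vectors rescaled to unit length (stand-in for a gene code)."""
    total = Counter()
    for vec in vectors:
        for g, w in vec.items():
            total[g] += w
    norm = sqrt(sum(w * w for w in total.values()))
    return {g: w / norm for g, w in total.items()} if norm else {}

def classify(vec, prototypes, threshold=0.5):
    """Return the index of the most similar prototype, or None if none reaches the threshold."""
    best, best_sim = None, threshold
    for i, proto in enumerate(prototypes):
        sim = cosine(vec, proto)
        if sim >= best_sim:
            best, best_sim = i, sim
    return best

def cluster(vectors, threshold=0.5):
    """Greedy leader clustering: join the closest cluster or start a new one."""
    members, prototypes = [], []
    for vec in vectors:
        idx = classify(vec, prototypes, threshold)
        if idx is None:
            members.append([vec])
            prototypes.append(vec)
        else:
            members[idx].append(vec)
            prototypes[idx] = centroid(members[idx])
    return prototypes

def incremental_update(prototypes, new_vectors, threshold=0.5):
    """Classify new samples; cluster only the rejected ones and append their prototypes."""
    labels = [classify(v, prototypes, threshold) for v in new_vectors]
    rejected = [v for v, lab in zip(new_vectors, labels) if lab is None]
    prototypes.extend(cluster(rejected, threshold))  # existing prototypes are left untouched
    return [lab if lab is not None else classify(v, prototypes, threshold)
            for v, lab in zip(new_vectors, labels)]

# Hypothetical behavior reports for two families.
samples = [
    ["create_file", "write_file", "set_registry", "connect"],
    ["create_file", "write_file", "set_registry", "connect", "send"],
    ["open_process", "inject_code", "create_thread"],
    ["open_process", "inject_code", "create_thread", "exit"],
]
prototypes = cluster([ngram_vector(s) for s in samples])
```

In this sketch, samples that match no existing prototype are the only ones clustered during an update, which mirrors the stated goal of avoiding retraining on the full sample set. A real system would likely need a more robust clustering algorithm and a richer behavior representation than bigrams.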
Keywords/Search Tags: Malware, Classification, Cluster Analysis, Classification Analysis, Behavioral Knowledge Base, Feature Extraction