A SVM-based Method For Detecting Computer Virus

Posted on: 2012-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: W Chen
Full Text: PDF
GTID: 2218330374453431
Subject: Computer software and theory
Abstract/Summary:
The proliferation of malware in recent years poses a serious threat to computer systems. As viruses become more complex and sophisticated, traditional static scanning techniques for malware detection show their limitations against polymorphic viruses. Static scanning relies primarily on static analysis of a virus, selecting machine code fragments or strings from the virus as its signature. When scanning a new file, the virus detection program searches the virus signature database for matching signatures. To detect unknown viruses, machine learning methods have been applied to virus detection, and some progress has been made. In this paper, we propose a polymorphic virus detection method based on the support vector machine (SVM) for the Windows platform.

Our approach rests on an analysis of the Windows Application Programming Interface (API) call sequence, which reflects the behavior of a particular piece of code. Extracting the API call sequence from a running program is a complicated process. This paper first discusses in detail the executable file format on Windows and the process by which Windows loads an executable file. The following chapter analyzes the complete system call path, from the function call in user mode to the operating system kernel code that implements the corresponding API. Several existing methods for extracting API sequences are then compared, and the virtual machine method is selected to extract the system call sequence of the virus. A virtual CPU and the other resources an application needs in order to run, which together constitute a virtual machine, are simulated to execute the malware. Most Windows systems run on the Intel x86 series CPU hardware platform, so this paper studies the x86 instruction format in detail. First, the virus is disassembled into its machine instructions. Then these instructions are loaded into the virtual machine. During execution, the API call sequences are recorded; these sequences reflect the behavior of the program.

To use a support vector machine to detect viruses, the first step is to convert the API call sequence into vectors that the support vector machine can process. Using a predefined mapping between each API and a corresponding ID, the API call sequences are converted into vector form. To achieve a higher detection rate, cross-validation and grid search are employed to find the kernel function parameters in the experiment.

The results indicate that, compared with existing commercial anti-virus software, the detection method based on the support vector machine improves the detection rate of unknown viruses to some extent, in particular for polymorphic viruses and virus variants.
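The conversion from a recorded API call sequence to a vector can be illustrated with a minimal Python sketch. The abstract does not give the actual API-to-ID table or the vector layout, so everything here is an assumption made for illustration: the table `API_IDS`, the `UNKNOWN_ID` and `PAD_ID` values, and the choice to truncate or pad each sequence to a fixed length so that all samples share one dimension.

```python
from typing import Dict, List

# Hypothetical API-to-ID table; the thesis does not list its actual mapping.
API_IDS: Dict[str, int] = {
    "CreateFileA": 1,
    "ReadFile": 2,
    "WriteFile": 3,
    "VirtualAlloc": 4,
    "CreateProcessA": 5,
}
UNKNOWN_ID = len(API_IDS) + 1  # assumed ID for APIs missing from the table
PAD_ID = 0                     # assumed filler value for short sequences


def encode_sequence(api_calls: List[str]) -> List[int]:
    """Replace each recorded API name with its numeric ID."""
    return [API_IDS.get(name, UNKNOWN_ID) for name in api_calls]


def to_fixed_length(ids: List[int], length: int) -> List[int]:
    """Truncate or right-pad an ID list so every sample has the same dimension."""
    return ids[:length] + [PAD_ID] * max(0, length - len(ids))


# A made-up trace, standing in for the sequence recorded by the virtual machine.
trace = ["VirtualAlloc", "CreateFileA", "WriteFile", "ExitProcess"]
vector = to_fixed_length(encode_sequence(trace), 6)
print(vector)
```

Vectors of this kind, one per sample and labeled as virus or benign, could then be passed to an SVM trainer. Other encodings, such as per-API call counts, would fit the same description; the abstract does not say which one the author used.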
Keywords/Search Tags: API sequence, computer virus, support vector machine, virus detection