Research On Malicious Webpage And PDF Document Detection Based On SVM Model

Posted on:2015-10-10

Degree:Master

Type:Thesis

Country:China

Candidate:S J Yang

GTID:2298330467988807

Subject:Computer technology

Abstract/Summary:

The Internet gives people more convenient and faster information services than traditional channels. However, the openness and vulnerability of the network also make attacks easier for hackers. Among current network attacks, the most popular is to embed exploit code in a benign webpage so that a malicious executable is downloaded automatically without the user's knowledge. This kind of attack poses a serious threat to Internet security. Traditional anti-virus engines struggle to detect obfuscated malicious code in a webpage or PDF document, because static signatures can only match readable, non-encrypted code. In addition, the static signature database keeps growing over time without limit. For these reasons, it is worthwhile to study a new detection technology for identifying obfuscated malicious code embedded in webpages or PDF documents.

In this thesis, the structure of webpages and PDF documents is analyzed first. Then the support vector machine (SVM), which is based on statistical learning theory, is introduced to learn a classification model from features extracted from the samples. A dynamic emulation tool is used to execute shellcode that may be embedded in malicious JavaScript, in order to analyze its specific behaviors. With these techniques, obfuscated malicious code in a webpage or PDF document can be detected. The main work of this thesis is as follows:

(1) Chapter 2 presents an overview of the attack and defense techniques for webpage Trojans. It introduces the attack principle and typical attack methods of such Trojans, together with the corresponding defense techniques and their advantages and disadvantages.

(2) To overcome the weakness of traditional anti-virus engines, this thesis introduces the SVM to detect webpage Trojans in place of the traditional signature-comparison approach. Specifically, the suspicious JavaScript is first extracted from the sample under test, and the suspicious characters in the extracted JavaScript are then counted to train the SVM. Finally, the SVM classifier is applied to divide the suspicious feature sets into a malicious class and a benign class.

(3) A PDF analysis model is designed to analyze the stream objects in a PDF document. By applying static analysis, the suspicious JavaScript can be extracted for further detection with the SVM classifier.

(4) A dynamic emulation tool is introduced to help analysts learn more about the detailed malicious behaviors of the shellcode embedded in malicious JavaScript.
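The extraction and counting steps in items (2) and (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say which suspicious characters are counted or how PDF streams are decoded, so the token list, the function names, and the restriction to Flate-compressed streams are all assumptions made for illustration.

```python
import re
import zlib

# Hypothetical feature set: the abstract does not list the characters it counts.
SUSPICIOUS_TOKENS = ("eval", "unescape", "fromCharCode", "%u", "\\x")

# Matches the body between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def inflate_pdf_streams(pdf_bytes):
    """Return the bodies of all stream objects found in raw PDF bytes.

    Flate-compressed bodies are inflated; anything else is returned as stored.
    Other PDF filters (ASCIIHex, LZW, ...) are deliberately not handled here.
    """
    bodies = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates a body whose final byte was cut off
            bodies.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            bodies.append(raw)  # not Flate data; keep the raw bytes
    return bodies


def feature_vector(script):
    """Build a numeric feature vector from JavaScript source text.

    Layout: one count per suspicious token, then the script length, then the
    share of characters that are neither alphanumeric nor whitespace.
    """
    counts = [script.count(token) for token in SUSPICIOUS_TOKENS]
    symbols = sum(1 for ch in script if not ch.isalnum() and not ch.isspace())
    ratio = symbols / len(script) if script else 0.0
    return counts + [len(script), ratio]
```

Vectors of this kind, labeled malicious or benign, could then be passed to any SVM implementation for training, for example scikit-learn's `sklearn.svm.SVC`; that library choice is likewise an assumption and is not named in the abstract.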

Keywords/Search Tags:

Webpage Trojan, Support Vector Machine, PDF document, JavaScript engine, Shellcode

Related items

1	Gate-level Hardware Trojan Detection Method Based On Support Vector Machine
2	Research And Implementation Of Iis Webpage Trojan Detection System Based On Dom Model
3	Research On Hardware Trojan Detection Method Based On Machine Learning
4	Research On Some Problems Of Support Vector Machine Learning Algorithm
5	Research On Technology Of Trojan Horse Detection Based On Behavior Analysis
6	Research And Implementation Of Web Page Classification Based On CNN And SVM
7	Classification For Webpage Trojan Detection Based On DOM Modeling
8	Classification For Webpage Trojan Detection Based On Dom Modeling
9	Design And Implementation Of Content-based Webpage Collection And Classification System
10	Study Of Webpage-Trojan Detection Technology