An Intelligent Detection System for Malicious Web Pages Based on Machine Learning

Posted on: 2012-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: S Wang
Full Text: PDF
GTID: 2208330335486274
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the rapid development of the Internet and the abundance of network resources, websites have emerged in great numbers. However, many of them contain malicious code. When a user browses such a site, the malicious code is installed on the user's computer without the user's knowledge, and the system is infected and damaged.

This thesis introduces background on malicious code and analyzes how websites that carry malicious code work. Rendering a web page is, in essence, the browser executing code. Once malicious code is inserted among otherwise normal content, the page becomes destructive and malicious. Commercial anti-virus software currently relies on "signature" detection, which can detect only known malicious code. Machine learning methods learn from known malicious and normal code, so they can detect not only known malicious code but also unknown malicious code well. This thesis adopts the back-propagation algorithm and the decision tree algorithm to train classifiers, whose performance depends heavily on the features of the samples. We compare malicious code with normal code to summarize 14 typical features, and then train the classifiers with these features and their corresponding labels.

In the experiments, a data acquisition module uses a Web crawler to fetch web pages and to collect and label samples (JavaScript code), and a classifier training and validation module extracts the features used to train and test the classifiers.

The research and experiments show that machine-learning-based detection of malicious code is efficient and accurate, and that the fourteen features we define are representative and influential. The work thus provides technical support for the detection of malicious web pages.
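The pipeline described in the abstract (turn each JavaScript sample into a feature vector, then train a classifier on labeled vectors) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the five features below are hypothetical stand-ins (the fourteen features are not listed in this abstract), the four samples are invented, and a depth-one decision tree (a "stump") stands in for the full decision tree and back-propagation classifiers.

```python
import re

def extract_features(js_source):
    """Map JavaScript source to a numeric feature vector.

    The features are hypothetical examples chosen for illustration;
    they are not the fourteen features defined in the thesis.
    """
    string_literals = re.findall(r"\"[^\"]*\"|'[^']*'", js_source)
    return [
        float(len(re.findall(r"\beval\s*\(", js_source))),          # eval( calls
        float(len(re.findall(r"\bunescape\s*\(", js_source))),      # unescape( calls
        float(len(re.findall(r"document\.write\s*\(", js_source))), # document.write( calls
        float(len(re.findall(r"%u[0-9a-fA-F]{4}|\\x[0-9a-fA-F]{2}", js_source))),  # escaped characters
        float(max((len(s) for s in string_literals), default=0)),   # longest string literal, quotes included
    ]

def train_stump(X, y):
    """Train a depth-one decision tree: choose the (feature, threshold)
    split with the fewest training errors, predicting 1 (malicious)
    when the feature value exceeds the threshold."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            preds = [1 if row[j] > t else 0 for row in X]
            errors = sum(p != label for p, label in zip(preds, y))
            if best is None or errors < best[0]:
                best = (errors, j, t)
    _, j, t = best
    return j, t

def predict(stump, features):
    """Classify one feature vector with a trained stump."""
    j, t = stump
    return 1 if features[j] > t else 0

# Invented toy samples: label 0 = normal, 1 = malicious.
samples = [
    ('function add(a, b) { return a + b; }', 0),
    ('document.getElementById("x").innerHTML = "hello";', 0),
    ('eval(unescape("%u9090%u9090%u4141"));', 1),
    ('var s = unescape("%u0c0c%u0c0c"); eval(s);', 1),
]
X = [extract_features(src) for src, _ in samples]
y = [label for _, label in samples]
stump = train_stump(X, y)
```

Unlike a signature match, the trained split does not depend on any specific byte sequence, so a script absent from the training set, such as `eval(unescape("%u4141"))`, is still flagged by `predict(stump, extract_features(...))` on this toy data. A real system would need far more samples, a held-out test set, and a deeper tree or a neural network.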
Keywords/Search Tags: malicious code, machine learning, classifier, JavaScript, Web crawler