The Research Of Malicious Web Pages Detection Based On Multiple Features

Posted on:2014-09-18

Degree:Master

Type:Thesis

Country:China

Candidate:T Yue

GTID:2268330425983781

Subject:Computer Science and Technology

Abstract/Summary:

Web technologies have become increasingly widespread with the growth of the Internet. Web users are threatened by several types of malicious webpages, in particular phishing, spam, and malware pages, each of which has its own characteristics. Users generally find it difficult to recognize malicious webpages, and current research on malicious webpage detection and type recognition leaves room for improvement. Feature extraction is the key step in malicious webpage detection. This thesis investigates and analyzes feature extraction methods for malicious webpages, proposes a new feature extraction method, and implements a system for detecting malicious webpages. The principal contributions are as follows.

First, the thesis reviews and analyzes existing webpage feature extraction methods. To address their shortcomings, it proposes a feature extraction method for malicious webpage detection based on webpage source code and URL properties. The method uses static analysis to extract features from the page code and script information, analyzes the URL to extract lexical features of its text and properties of the associated host, and represents these features as numerical feature vectors. Comparative experiments between the proposed method and methods from the existing literature are conducted on specific datasets, and the methods are evaluated in terms of the detection accuracy of the system.

Second, the thesis designs and implements a malicious webpage detection system based on the proposed feature extraction method. The webpage collection module gathers the webpage datasets. The feature extraction module applies the proposed method to the webpage dataset and builds a webpage feature library. The data storage module stores the webpage data on disk. The detection and classification module uses the k-nearest neighbors (KNN) algorithm and a support vector machine (SVM) for detection, and uses a KD-tree to optimize KNN and reduce its time overhead. Experiments are presented to analyze the performance of the system and the time overhead of detection.
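
To make the idea of numerical feature vectors concrete, the following Python sketch shows one way URL text and static page code could be turned into numbers. It is an illustration only: the abstract does not list the thesis's actual features, so the function names `url_lexical_features` and `page_static_features` and every feature chosen here (lengths, token counts, script and iframe counts, and so on) are assumptions.

```python
import re
from typing import List
from urllib.parse import urlparse


def url_lexical_features(url: str) -> List[float]:
    """Hypothetical lexical and host features of a URL (not the thesis's list)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    path = parsed.path or ""
    # Split the URL text into alphanumeric tokens ("vocabulary" of the URL).
    tokens = [t for t in re.split(r"[^A-Za-z0-9]+", url) if t]
    # A host written as a dotted IPv4 address instead of a domain name.
    host_is_ip = bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host))
    return [
        float(len(url)),                                  # total URL length
        float(len(host)),                                 # host name length
        float(host.count(".")),                           # dots in the host
        float(sum(c.isdigit() for c in url)),             # digit characters
        float(len(tokens)),                               # number of tokens
        float(max((len(t) for t in tokens), default=0)),  # longest token
        1.0 if host_is_ip else 0.0,                       # IP-address host flag
        1.0 if "@" in url else 0.0,                       # '@' present flag
        float(path.count("/")),                           # path depth
    ]


def page_static_features(html: str) -> List[float]:
    """Hypothetical static counts taken from page source without executing it."""
    lowered = html.lower()
    return [
        float(len(re.findall(r"<script\b", lowered))),  # opening script tags
        float(len(re.findall(r"<iframe\b", lowered))),  # opening iframe tags
        float(lowered.count("eval(")),                  # eval( occurrences
        float(lowered.count("unescape(")),              # unescape( occurrences
        float(len(html)),                               # source length
    ]


def feature_vector(url: str, html: str) -> List[float]:
    """Concatenate URL and page features into one numerical vector."""
    return url_lexical_features(url) + page_static_features(html)
```

Concatenating the two lists gives a fixed-length vector per page, which is the form a KNN or SVM classifier expects as input.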
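
The KD-tree optimization of KNN mentioned above can be sketched as follows. This is a minimal, standard-library illustration of the general technique, not the thesis's implementation: the names `build_kdtree`, `knn_search`, and `knn_classify`, the squared Euclidean distance, and the majority vote are all assumptions. The tree lets the search skip whole subtrees that cannot contain a closer neighbor, which is where the time saving over a brute-force scan comes from.

```python
import heapq
from collections import Counter
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


class Node:
    """One KD-tree node: a labeled point plus the axis it splits on."""

    def __init__(self, point: Point, label: int, axis: int,
                 left: Optional["Node"], right: Optional["Node"]) -> None:
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right


def build_kdtree(samples: List[Tuple[Point, int]], depth: int = 0) -> Optional[Node]:
    """Recursively split the samples at the median, cycling through the axes."""
    if not samples:
        return None
    axis = depth % len(samples[0][0])
    ordered = sorted(samples, key=lambda s: s[0][axis])
    mid = len(ordered) // 2
    point, label = ordered[mid]
    return Node(point, label, axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_search(node: Optional[Node], query: Point, k: int,
               heap: Optional[list] = None) -> list:
    """Collect the k nearest points as a heap of (-sq_dist, node id, label)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    d = sq_dist(node.point, query)
    item = (-d, id(node), node.label)  # negated so heap[0] is the farthest kept
    if len(heap) < k:
        heapq.heappush(heap, item)
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, item)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    knn_search(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th nearest neighbor (or fewer than k points were found).
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn_search(far, query, k, heap)
    return heap


def knn_classify(root: Optional[Node], query: Point, k: int = 3) -> int:
    """Majority vote over the labels of the k nearest neighbors."""
    votes = Counter(label for _, _, label in knn_search(root, query, k))
    return votes.most_common(1)[0][0]
```

In use, the tree would be built once from the labeled feature library with `build_kdtree` and then queried with `knn_classify(root, vector, k)` for each page to be checked; the label encoding (for example 0 for benign, 1 for malicious) is likewise an assumption.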

Keywords/Search Tags:

Malicious Web Pages, Feature Extraction, URL properties, Detection

Related items

1	The Research And Implementation Of Malicious Web Pages Detection
2	Analysis And Detection System Of High-obfuscated Malicious Web Pages
3	The Monitoring And Precaution Of Malicious Web Pages And Its Security Threats
4	Research And Implementation Of Detecting System For Malicious WEB Pages
5	Detection Of Malicious Web Pages Based On Script Static Analysis
6	Mining The Link Spamming And Malicious Web Pages Based On Topology Structure Of Massive Internet Web Pages
7	Research On The Detection Of Malicious Web Traffic In Cloud Platform
8	Fine-grained Behavior Capture And Malicious Detection Of Web Pages
9	Automated Tracking And Analysis System Of Intelligent Malicious Web Pages
10	Clustering Analysis Of Malicious Code Based On N-gram Feature Extraction