Research And Implementation Of Malicious Domains Detection Technology Based On Big Data Analysis

Posted on:2019-04-01

Degree:Master

Type:Thesis

Country:China

Candidate:C X Yin

Full Text:PDF

GTID:2348330545955598

Subject:Computer technology

Abstract/Summary:

Cyber security is a topic we can never evade. Criminals often use domain names to carry out attacks on the Internet, for example to connect to Trojans and to support botnet communication. Techniques such as Fast-Flux and DGA make network attacks more covert and malicious domain names harder to recognize. Domain blacklists are of limited use in this situation, and analyzing the DNS data of a domain is a more efficient way to identify malicious domains.

This paper first investigates techniques related to malicious domain detection, analyzes the difficulties that detection faces, summarizes existing technical solutions and research results, studies machine learning classification models and big data technology, and uses Hadoop, Spark, Kafka and other tools to set up a big data analytics infrastructure. On this basis, the paper starts from DNS data and uses machine learning to construct a malicious domain detection model based on DNS behavior features. It analyzes the statistical distribution of DNS data from multiple perspectives and extracts 22 features across four dimensions. Cross-validation is used to compare two classification models, random forest and GBDT. The tests show that random forest has advantages in accuracy, recall and other indicators.

Finally, a malicious domain detection system is designed and implemented on a big data platform, and the constructed detection model is used in the system. The system architecture takes a series of questions into consideration, such as input source, data storage, execution efficiency and scalability, and is divided into 4 functional modules. To ensure that the system remains stable and available in high-speed networks, several performance optimizations have been adopted. A network traffic diversion model is used to improve the ability to capture high-speed network traffic. The Kafka configuration is optimized to deal with short-term surges in network traffic and to improve system throughput and stability. Whitelist filtering removes data for popular legitimate domain names, which account for the majority of DNS data, thereby reducing the processing load on subsequent modules. The preprocessing module aggregates domain information and writes the results to MongoDB at regular intervals, which reduces repeated reading and processing of HDFS data. The model designed and the system implemented in this paper are deployed in a real network to detect malicious domains online. After testing, the system has achieved good detection accuracy and detection efficiency.
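
The whitelist filtering and per-domain aggregation steps mentioned in the abstract might look roughly like the following minimal Python sketch. The whitelist contents, the record layout (query name, resolved IP) and the function names are assumptions made for illustration and are not taken from the thesis; the system described above runs on Kafka, Spark, HDFS and MongoDB rather than on in-memory structures.

```python
from collections import defaultdict

# Hypothetical whitelist of popular legitimate domains (illustrative only).
WHITELIST = {"example.com", "example.org"}


def is_whitelisted(qname, whitelist=WHITELIST):
    """Return True if qname is a whitelisted domain or a subdomain of one."""
    labels = qname.lower().rstrip(".").split(".")
    # Test every label-aligned suffix, e.g. "a.example.com" -> "example.com".
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def aggregate(records, whitelist=WHITELIST):
    """Drop whitelisted queries, then count queries and distinct IPs per domain."""
    stats = defaultdict(lambda: {"queries": 0, "ips": set()})
    for qname, ip in records:
        if is_whitelisted(qname, whitelist):
            continue  # popular legitimate domain: skip further processing
        entry = stats[qname.lower().rstrip(".")]
        entry["queries"] += 1
        entry["ips"].add(ip)
    return dict(stats)


# Assumed record format: (queried domain name, resolved IP address).
records = [
    ("www.example.com", "192.0.2.10"),
    ("xkqzv.test", "198.51.100.1"),
    ("xkqzv.test", "198.51.100.2"),
    ("Example.org.", "192.0.2.20"),
]
result = aggregate(records)
```

Matching on label-aligned suffixes means that subdomains of a whitelisted name are filtered too, while a lookalike such as `badexample.com` is not. In a deployment like the one described, the aggregated counts would be written to a store at intervals instead of being kept in a dictionary.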

Keywords/Search Tags:

malicious domains, machine learning, big data, online detection

Related items

1	The Study Of Malicious Code Detection Based On Data Mining And Machine Learning
2	Research On Malicious Domain Detection Under Hadoop Environment
3	The Research And Implementation Of Incremental Parallel SVM In Malicious Domain Detection
4	Research On Malicious URL Detection Technology Based On Machine Learning
5	Research On User Malicious Comments Detection Based On Machine Learning
6	Research On Malicious URL Detection Based On Machine Learning
7	Research Of Distributed Malicious Web Site Detection Model
8	A Cellphone Malicious Behaviors Research Based On Mobile Base Station Data
9	Malicious Code Detection Technology Based On Machine Learning Algorithm
10	Leveraging Temporal Pattern To Detect Malicious Accounts In Online Social Network