
Design And Implementation Of Real Time Detection System For Malicious Domain Based On Spark Framework

Posted on: 2017-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Xu
GTID: 2348330503972512
Subject: Computer technology
Abstract/Summary:
DNS (Domain Name System) is one of the Internet's most important pieces of infrastructure: a distributed database that maps domain names to IP addresses. Many Web services depend on domain name resolution, which gives users convenient access to network resources. DNS itself has no mechanism for detecting malicious behaviour, so malicious programs use it to carry out a wide range of malicious network activities.

Traditional domain name detection methods include domain name blacklists, reverse engineering, and data mining. As new network technologies are adopted, however, malicious use of domain names has become increasingly flexible, and traditional methods can no longer detect these malicious domain names effectively. Instead, DNS log records can be analyzed to find the characteristics of malicious domain names. The domain name, access time, TTL (Time To Live) value, and IP address are extracted from the DNS log, and the corresponding feature values are then calculated, such as the survival time of the domain name, the number of countries, and the number of domain names.

Spark can quickly extract these attributes from the DNS log and calculate the features to form the sample set. The real-time malicious domain name detection system is implemented on the Spark framework and consists of three modules: a capture module, a Spark cluster module, and a monitoring module. The C4.5 algorithm is trained on the samples to build a detection model, which is then used to detect domain names in real time.

A Spark cluster and a Hadoop cluster were built and both were used to analyze DNS logs; the results show that Spark computes significantly faster than Hadoop. A test of Spark's ability to receive domain name data in real time indicates that Spark can quickly ingest a real-time data stream. Finally, a test of the real-time detection system indicates that running on Spark overcomes the shortcomings of the C4.5 algorithm. In real-time detection the model achieves high accuracy and recall, showing that the detection system on Spark can effectively detect malicious domain names.
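
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it uses plain Python in place of Spark, and the log layout (`timestamp domain ttl ip`), the sample records, and the names `parse_line` and `extract_features` are all assumptions made for illustration. The country-count feature is omitted because it would need an IP geolocation source that the abstract does not specify.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log layout (an assumption, not the thesis's format):
# "timestamp domain ttl ip", separated by whitespace.
SAMPLE_LOG = [
    "2016-03-01T10:00:00 example.com 3600 93.184.216.34",
    "2016-03-01T10:05:00 fastflux.test 60 10.0.0.1",
    "2016-03-01T10:06:00 fastflux.test 30 10.0.0.2",
    "2016-03-01T10:09:00 fastflux.test 30 10.0.0.3",
    "2016-03-03T10:00:00 example.com 3600 93.184.216.34",
]


def parse_line(line):
    """Split one log line into (domain, access time, TTL, IP address)."""
    ts, domain, ttl, ip = line.split()
    return domain, datetime.fromisoformat(ts), int(ttl), ip


def extract_features(lines):
    """Group records by domain and compute per-domain feature values."""
    grouped = defaultdict(list)
    for line in lines:
        domain, ts, ttl, ip = parse_line(line)
        grouped[domain].append((ts, ttl, ip))

    features = {}
    for domain, records in grouped.items():
        times = [r[0] for r in records]
        ttls = [r[1] for r in records]
        ips = {r[2] for r in records}
        features[domain] = {
            # Time between the first and last observed query.
            "survival_seconds": (max(times) - min(times)).total_seconds(),
            "mean_ttl": sum(ttls) / len(ttls),
            "distinct_ips": len(ips),
            "queries": len(records),
        }
    return features


FEATURES = extract_features(SAMPLE_LOG)
```

Each per-domain dictionary in `FEATURES` corresponds to one row of a sample set that a decision-tree learner such as C4.5 could be trained on. In a Spark job the same grouping would typically be expressed as a keyed aggregation over the log records.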
Keywords/Search Tags:malicious domain, real-time detection, big data, Spark, C4.5 algorithm