
Research On Parked Domain Detection Based On Domain Semantic Analysis

Posted on: 2019-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: J J Jin
GTID: 2428330548952317
Subject: Computer application technology
Abstract/Summary:
With the growing popularity of Internet applications, the number of registered domain names is increasing year by year. Statistics from the China Internet Network Information Center (CNNIC) show that domain names are growing far faster than websites, and that a large number of domain names are parked domains. Parked domains do not provide useful information resources to users; they mainly make a profit by displaying advertisement lists and by being resold. They seriously degrade the user's experience of the Internet, so research on parked domain detection is urgently needed. At present there are few studies on parked domain detection, and the existing methods are mainly based on webpage analysis. Because such methods must fetch the webpage, they suffer from large data volumes, low detection efficiency, and poor real-time performance. To address these shortcomings, this paper proposes a parked domain detection framework based on semantic analysis. By analyzing the semantic features of domain names, we evaluate the risk that a DNS server provides parking service, and combine this risk with the domain name resolution result and webpage features to detect parked domains. This effectively improves the real-time performance and efficiency of parked domain detection. The detailed research work is as follows:

1) We propose a method for evaluating a DNS server's parked domain risk based on domain semantics. Because parked domain names differ significantly in semantics from other domain names, we randomly select domains served by the DNS server, group every N domains into one domain text, and construct an SVM classifier for domain text categorization. To keep the evaluation objective, we construct multiple domain texts for the same DNS server and classify each of them with the SVM classifier. Finally, the ratio of domain texts classified as parked gives the parked domain risk coefficient of the DNS server. The results show that the risk coefficient can serve as an important detection indicator and provides the foundation for subsequent detection.

2) We propose a parked domain detection model based on webpage image features. Because parked domains are mainly used to display advertisement lists and to offer the domain for sale, their webpages differ significantly from other webpages in image features. We summarize these features from webpage screenshots and construct a parked domain detection model with a convolutional neural network (CNN). The results show that the model detects parked domains effectively.

3) Based on the DNS risk assessment model and the webpage image detection model, we propose a parked domain detection framework. Because of the way domain parking works, a large number of domain names resolve to a small number of IP addresses. From high-risk authoritative DNS servers, we randomly select domain names and resolve them. We obtain the IP addresses that host a large number of domain names, use the webpage image detection model to check the domain names on those IP addresses, and build a parked domain IP address database. When detecting a domain, we look up the DNS server of the domain name and make a judgement based on that server's parked domain risk coefficient. If the risk coefficient is below a specified threshold, the domain name is directly judged to be a non-parked domain. If instead the risk coefficient is above a second threshold, we resolve the IP address, and if it belongs to the parked domain IP address database, the domain name is directly judged to be a parked domain. Otherwise, the domain is passed to the webpage detection model for further detection. The results show that the framework effectively reduces the number of domain names whose webpages must be fetched and improves detection efficiency.
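The risk coefficient and the two-threshold decision described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the threshold values 0.2 and 0.8, and the callables standing in for the SVM text classifier and the CNN webpage model are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Set


def group_into_texts(domains: Sequence[str], n: int) -> List[str]:
    """Group every n domain names into one space-joined 'domain text'."""
    return [" ".join(domains[i:i + n]) for i in range(0, len(domains), n)]


def risk_coefficient(domain_texts: Sequence[str],
                     is_parked_text: Callable[[str], bool]) -> float:
    """Ratio of domain texts that the text classifier labels as parked.

    is_parked_text is a stand-in for the SVM classifier (hypothetical).
    """
    if not domain_texts:
        raise ValueError("at least one domain text is required")
    parked = sum(1 for text in domain_texts if is_parked_text(text))
    return parked / len(domain_texts)


def detect(risk: float,
           ip: str,
           parked_ips: Set[str],
           page_model: Callable[[], bool],
           low: float = 0.2,    # hypothetical lower threshold
           high: float = 0.8    # hypothetical upper threshold
           ) -> str:
    """Two-threshold cascade; the webpage is only fetched as a last resort."""
    if risk < low:
        return "non-parked"
    if risk > high and ip in parked_ips:
        return "parked"
    # page_model stands in for the CNN screenshot model (hypothetical).
    return "parked" if page_model() else "non-parked"
```

In this sketch the webpage model is passed as a callable so that it only runs when neither shortcut applies, which mirrors the stated goal of reducing the number of domains whose webpages must be fetched.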
Keywords/Search Tags: parked domain, parked domain detection, parking service, domain semantics analysis, typosquatting