Threat Intelligence Analysis Of Dark Network Based On Machine Learning

Posted on: 2021-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: H H Yu
Full Text: PDF
GTID: 2428330602997043
Subject: Computer application technology
Abstract/Summary:
The Internet has enabled interaction between people around the world with unprecedented breadth and convenience. However, the emergence and maturing of the dark web seriously threaten social and public security, so it is important to study how to explore dark web cyberspace. Dark web domain names differ from those of the open web: they are non-public, short-lived, and frequently updated. As a result, dark web domain names and dark web markets are difficult to identify, threat information is difficult to obtain, content distribution is complex, and the composition of the dark web is not clearly understood. To address these problems, this thesis carries out data collection and content analysis of the dark web. The work consists of the following parts:

(1) To address the problem of obtaining dark web data, this thesis crawls hidden service content and then classifies the crawled content. It applies several dark web crawling strategies, uses the Scrapy framework, and designs and implements a crawler for dark web data.

(2) To address the problem of identifying sellers in dark web markets, this thesis designs an analysis model for dark web markets. First, data sources are identified using keyword, snowball, and deep web hidden service methods. Then data is collected and assets are analyzed from the collected data, enabling the active acquisition of network threat intelligence.

(3) To address the problem of collecting domain names and addresses, this thesis designs a dark web domain name aggregation system that collects in three ways: dark web directory aggregation, Tor2web domain name keyword aggregation, and social networking site aggregation. First, dark web directory aggregation serves as the main source of dark web domain names. Second, an algorithm for discovering specific keywords is proposed and run through Tor2web. Finally, the Scrapy crawler framework is used to extract domain name addresses from content published on the social networking site Reddit. At the end of that chapter, the aggregation of dark web domain names is tested. The test mainly uses keyword search, with queries issued to three search engines: Torch, DuckDuckGo, and Ahmia.

(4) To address the low accuracy of KNN classification, the KNN algorithm is improved. Most algorithms currently used for dark web data classification extract only a single feature value from each text and do not consider the relationships between texts. In view of this, an improved KNN algorithm based on association rules is proposed. First, the Apriori algorithm is improved. Then association rules are mined with the improved Apriori algorithm, the frequent itemsets are extracted, and the k nearest neighbors are determined. Finally, the dark web data is classified with the KNN algorithm. Experiments at the end of the thesis verify the association-rule-based KNN algorithm: it improves the accuracy of dark web data classification, and the results show that the improved Apriori algorithm combined with KNN is more effective for dark web classification.
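
As a rough illustration of the idea in part (4), the sketch below mines frequent itemsets from tokenized documents and uses them as features for a k-nearest-neighbor vote. It is not the thesis's method: it uses plain Apriori rather than the improved variant, whose details the abstract does not give, and the Jaccard similarity, the function names, and the toy documents and labels are all assumptions made here for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Plain Apriori: itemsets contained in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items and keep the frequent ones.
    counts = Counter(item for t in transactions for item in t)
    current = {frozenset([i]) for i, c in counts.items() if c >= min_support}
    result = set(current)
    size = 2
    while current and size <= max_size:
        # Join step: merge frequent (size-1)-itemsets into size-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, size - 1))
        }
        # Support counting against the transactions.
        current = {
            c for c in candidates
            if sum(1 for t in transactions if c <= t) >= min_support
        }
        result |= current
        size += 1
    return result


def featurize(tokens, itemsets):
    """Represent a document as the set of frequent itemsets it fully contains."""
    tokens = frozenset(tokens)
    return {s for s in itemsets if s <= tokens}


def jaccard(a, b):
    """Jaccard similarity of two feature sets (an assumed choice of metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(query_tokens, train_docs, train_labels, itemsets, k=3):
    """Majority vote among the k training documents most similar to the query."""
    query = featurize(query_tokens, itemsets)
    scored = sorted(
        ((jaccard(query, featurize(doc, itemsets)), label)
         for doc, label in zip(train_docs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: token lists standing in for crawled page text.
train_docs = [
    ["buy", "stolen", "card", "cvv"],
    ["stolen", "card", "dump", "cvv"],
    ["card", "cvv", "fullz"],
    ["exploit", "malware", "botnet"],
    ["malware", "botnet", "rent"],
    ["exploit", "malware", "kit"],
]
train_labels = ["fraud", "fraud", "fraud", "malware", "malware", "malware"]

itemsets = frequent_itemsets(train_docs, min_support=2)
print(knn_predict(["cheap", "card", "cvv", "shop"], train_docs, train_labels, itemsets))
```

Using co-occurring term sets as features, instead of single terms, is one way to capture the relationships between texts that the abstract says single-feature approaches miss. A real pipeline would need proper tokenization, a tuned support threshold, and a held-out evaluation set.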
Keywords/Search Tags: Dark web, Tor, Domain name address, Hidden service