Research And Application Of Dark Web Data Efficient Acquisition Technology

Posted on: 2022-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: L Xiang
Full Text: PDF
GTID: 2518306524493774
Subject: Master of Engineering
Abstract/Summary:
The dark web, which sits at the bottom of the Internet's multi-layer structure, has become a breeding ground for illegal transactions and activities because of its characteristics. It seriously endangers network security and poses severe challenges to social stability and national security, so monitoring and controlling the dark web is imperative. However, the dark web differs from ordinary networks: its concealment and the high value of its data make data acquisition extremely difficult. This thesis therefore studies the difficulty of obtaining dark web data in depth, and designs and implements a dark web data acquisition system. The main contributions are as follows:

(1) To address the difficulty of obtaining dark web domain names, this thesis proposes four domain name acquisition methods, comprising two optimization methods and two auxiliary methods. First, manually browsing catalog web pages to collect domain names. Second, an improved OnionScan scanning collection method. Third, an improved keyword search algorithm combined with the Tor2web project to collect domain names. Fourth, collecting domain names from dark web pages. The four methods together yielded 13,854 domain names. The most efficient was the second method, which collected 8,176 domain names at an average of about 341 per hour. The least efficient was the first method, which collected 567 domain names at an average of about 54 per hour.

(2) To address the difficulty of accessing dark web data, this thesis proposes three methods for obtaining two types of data. First, user space data is obtained by combining the Scrapy framework with a Tor (the second-generation onion router) proxy to access the dark web and fetch web page data. Second, two methods are designed to obtain network space (cyberspace) data: one injects a node into the network and captures the data packets flowing through it; the other uses the result files produced by OnionScan scans to query node information with the Shodan tool. In total, 48,091 user space data records were obtained, of which data related to drugs and private information accounted for about 67%. A total of 6,176 network space data records were obtained, most of them from intermediate forwarding nodes and a small part from server nodes.

(3) The thesis proposes and implements a complete dark web data acquisition system. Building on the preceding work on acquiring dark web domain names and dark web data, the system comprises three modules: domain name address acquisition, user space data acquisition, and network space data acquisition. The thesis explains the design and implementation of each module and stores the acquired dark web data and domain names in corresponding data tables. In addition, three functional test cases are designed for the three modules. Finally, the data is presented on a visualization page, including statistics on the dark web data, the distribution of dark web nodes, and the communication relationships among them.
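The fourth domain name collection method in the abstract, collecting domain names that appear in dark web pages, can be illustrated with a short sketch. This is not the thesis's implementation, whose details the abstract does not give. The function name `extract_onion_domains` and the pattern `ONION_RE` are hypothetical, and the sketch assumes that a domain name is a 16-character (v2) or 56-character (v3) base32 label followed by `.onion`.

```python
import re

# Assumption: v2 onion addresses are 16 base32 characters and v3 addresses
# are 56; base32 here means the letters a-z and the digits 2-7.
ONION_RE = re.compile(
    r"\b((?:[a-z2-7]{56}|[a-z2-7]{16})\.onion)\b", re.IGNORECASE
)


def extract_onion_domains(page_text):
    """Return the unique .onion domains in page_text, in first-seen order."""
    seen = set()
    domains = []
    for match in ONION_RE.finditer(page_text):
        domain = match.group(1).lower()  # normalize case before deduplicating
        if domain not in seen:
            seen.add(domain)
            domains.append(domain)
    return domains
```

Applied to the text of each fetched page, a function like this would turn every crawled page into a source of new candidate domain names, which is one plausible reading of how page content feeds back into domain collection.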
Keywords/Search Tags: Dark web, The second-generation onion router, Obtaining domain name, Obtaining dark web data