Design And Implementation Of Data Collection And Analysis System For Tor Darknet

Posted on: 2022-01-27; Degree: Master; Type: Thesis
Country: China; Candidate: J J Bian; Full Text: PDF
GTID: 2518306488985939; Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rapid development of the Internet and communication technology, the Internet has brought convenience to people, but its negative effects have also attracted growing attention. Because of the strong concealment of the darknet and the particular nature of its transaction methods, a large number of illegal activities take place there, such as the sale of citizens' personal information, drug trafficking, and the provision of hacking tools. Cybercrime on the darknet is more concealed than on the surface web, which makes it harder for law enforcement agencies to obtain evidence and supervise its data. The Tor hidden service network is one of the most popular darknets at present and a main carrier of all kinds of illegal content. Obtaining as much Tor hidden service data as possible, and detecting and analyzing the entities and illegal events in its content, is an important component of the supervision of darknet content. It is therefore of great significance to study data collection and analysis technology for the Tor darknet. The main content of this thesis covers the following aspects:

(1) This thesis combines comprehensive link extraction with active link extraction, focusing on actively generating links by deploying server nodes that carry the HSDir flag. In addition, a distributed crawler network is deployed, combining breadth-first and depth-first crawling strategies to fetch the page content of the collected links; this content serves as the data source for the analysis. In a relatively short period of time, the work obtained Tor darknet address links and page content on the order of 100,000.

(2) This thesis studies data analysis technology for the Tor darknet. First, the pre-trained model BERT is combined with regular-expression matching to identify information entities in the collected Tor darknet text, including subject, organization, location, and mailbox (email address). Next, TextRank automatic summarization is used to summarize the core content of each text, and the summary and the information entities together form an event library. Finally, trigger words are combined with text similarity to design an event detection program for specific events, with word similarity used to recommend new trigger words; the result is an extensible trigger vocabulary and event library.

(3) This thesis designs and implements a data collection and analysis system for the Tor darknet. The system uses offline computation and a background program to integrate a Chinese darknet word cloud, event analysis statistics, statistics on darknet-related provinces, and the proportion of websites by language; these statistics are displayed on the home page. To show a more detailed relationship network, the extracted information entities are associated, analyzed, and displayed visually. Finally, the system provides a search function over the crawled darknet content and events so that useful information can be found quickly.

In summary, this thesis designs and implements a data collection and analysis system for the Tor darknet. The system can help law enforcement agencies monitor and keep track of the Tor darknet events they are concerned with, improve the efficiency of clue investigation, and provide technical support for tracking darknet crimes.
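
To illustrate the event detection idea in part (2), the following is a minimal sketch, not the thesis's implementation. It assumes a small hand-written trigger vocabulary (`TRIGGERS`) and one reference text per event type (`REFERENCES`), both hypothetical, and it swaps in plain bag-of-words cosine similarity where the thesis uses its own text and word similarity measures. A regular expression for email addresses stands in for the rule-based part of entity recognition; the BERT-based part is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical trigger vocabulary: event type -> trigger words (not from the thesis).
TRIGGERS = {
    "data_sale": {"database", "leak", "dump", "records"},
    "drug_sale": {"cocaine", "mdma", "cannabis"},
}

# Hypothetical reference text per event type, used for the similarity check.
REFERENCES = {
    "data_sale": "selling leaked database dump with customer records",
    "drug_sale": "cocaine mdma cannabis for sale worldwide shipping",
}

# Simple email pattern standing in for the regular-matching entity rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect_events(text, threshold=0.2):
    """Return (event_type, score) pairs for events whose trigger words occur
    in the text and whose reference text is similar enough to it."""
    tokens = tokenize(text)
    token_set = set(tokens)
    hits = []
    for event, words in TRIGGERS.items():
        if words & token_set:  # at least one trigger word is present
            score = cosine(tokens, tokenize(REFERENCES[event]))
            if score >= threshold:
                hits.append((event, round(score, 3)))
    return hits


def extract_emails(text):
    """Extract email-like strings with the regular expression above."""
    return EMAIL_RE.findall(text)


page = "Fresh database dump, 2 million customer records. Contact seller@example.com"
events = detect_events(page)
emails = extract_emails(page)
```

In this sketch the trigger words act as a cheap first filter and the similarity threshold (an arbitrary 0.2 here) suppresses pages that merely mention a trigger word in passing. Extending the vocabulary amounts to adding entries to `TRIGGERS`.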
Keywords/Search Tags: Tor darknet, Data collection, Named Entity Recognition, Association Analysis, Event Analysis