Analysis And Research Of Darknet Chinese Online Anonymous Market Based On BERT

Posted on: 2022-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: C M Zhang
GTID: 2518306755495994
Subject: Computer technology
Abstract/Summary:
The Internet is now deeply integrated with daily life, people's desire for privacy protection and freedom of speech is growing, and anonymous communication systems have begun to attract public attention. Among them, the Tor network, which is the most widely used, provides hidden services for the sake of anonymity, but it has also become the umbrella for online anonymous markets that trade in illegal commodities. Under the cover of the Tor network, the Chinese online anonymous market poses serious challenges to China's national security and also makes related research difficult. The difficulties are mainly that Chinese online anonymous market addresses are hidden and collecting them is inefficient, that commodity labels in the markets are missing, and that public market data is limited. To address these difficulties, this paper makes the following contributions:

(1) To address the low efficiency of address collection for Chinese online anonymous markets, this paper proposes a hidden service filtering strategy based on a Bloom filter to improve the collection efficiency of hidden services. The strategy includes Redis-based filtering of abandoned and low-activity hidden services, filtering of mirror hidden services based on cosine similarity, and filtering of pornographic services based on page information. Based on this strategy, an information collection system for Chinese online anonymous markets is designed on the Scrapy framework. The system comprises an address collection module with three methods of acquiring domain-name addresses, a hidden service analysis module based on cosine similarity calculation, a Docker-based distributed crawler module, and a commodity crawler module dedicated to collecting commodity information. The results show that, under the same conditions, this method saves 34.8% of the time compared with the common Docker-based distributed method.

(2) To address the lack of accurate commodity labels in the markets, this paper collects a data set covering more than 90,000 commodities in 7 markets and proposes an automatic labeling method for commodities in Chinese online anonymous markets. The method is based mainly on the BERT pre-trained model and divides the commodities in Chinese online anonymous markets into six categories. The results show that the best text input feature is obtained by extracting keywords from the commodity description with the TF-IDF algorithm and concatenating them with the commodity title; with this input, the classification accuracy reaches 85.14% under the experimental conditions of this paper.

(3) To address the incompleteness of public data on Chinese online anonymous markets, this paper proposes an estimation method for Chinese markets whose data lacks time fields. The missing fields are replaced by approximations with controllable error and combined with the trained automatic commodity labeling model. Finally, by comparison with the English-language online anonymous markets, the paper analyzes the commodities, sellers, and operating mechanisms of the Chinese market. The results show that the Chinese online anonymous market is dominated by data goods, with a total revenue of at least US$20 million from 2018 to 2022. The commercialization of illegal acts is more serious than that described in the report.
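The mirror-filtering step in contribution (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the bag-of-words term-frequency representation, and the 0.9 threshold are all assumptions made for illustration, and real Chinese pages would need a word segmenter rather than the simple regular-expression tokenizer used here.

```python
import math
import re
from collections import Counter


def term_vector(page_text):
    """Bag-of-words term-frequency vector for one page (hypothetical representation)."""
    return Counter(re.findall(r"\w+", page_text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_mirrors(pages, threshold=0.9):
    """Keep the first-seen address of each group of near-identical pages.

    pages: dict mapping an address label to its page text.
    threshold: assumed similarity cut-off above which a page counts as a mirror.
    """
    kept = {}
    for address, text in pages.items():
        vec = term_vector(text)
        if any(cosine_similarity(vec, seen) >= threshold for seen in kept.values()):
            continue  # treated as a mirror of an already kept service
        kept[address] = vec
    return list(kept)


# Placeholder pages; the labels and texts are invented for the example.
pages = {
    "site-a": "market listing data cards accounts",
    "site-b": "market listing data cards accounts",
    "site-c": "forum discussion about privacy tools",
}
unique_sites = filter_mirrors(pages)
```

Comparing each new page against every kept page is quadratic in the number of services, which is acceptable for a sketch; a crawler at scale would presumably reduce the candidate set first, for example with the Bloom filter mentioned above.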
Keywords/Search Tags: Tor, Anonymous Market, Hidden Service Collection, Commodity Classification