Research On Tor Anonymous Network Traffic Identification

Posted on: 2023-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Guo
Full Text: PDF
GTID: 2568307061450224
Subject: Cyberspace security
Abstract/Summary:
Anonymous networks hide the identities of the communicating parties and the relationship between them. Among anonymous communication tools, The Onion Router (Tor), which is based on onion routing, is the most mature. When clients access the network through Tor, multi-hop routing effectively hides information about both ends of the communication, which has led criminals to exploit it to evade regulation. To further enhance anonymity, Tor is often used together with obfuscation techniques that disguise or encrypt the traffic between clients and the Tor network, making that traffic difficult to analyze. Effectively identifying obfuscated Tor traffic between clients and the Tor network is therefore a problem that network regulation needs to solve. Most current research on obfuscated Tor traffic still relies on generic traffic features, without characterizing the traffic in terms of the specific obfuscation method and without accounting for the low proportion of Tor traffic in real-world networks. To address these issues, this thesis analyzes the working principles of two common Tor obfuscation techniques and proposes methods to identify the obfuscated traffic they produce. The research covers the following:

(1) An identification method is proposed for Tor traffic encrypted by the obfs4 protocol. The method follows the communication process of Tor-obfs4 traffic, analyzes the transferred packets, and selects features from them. To process traffic and extract features more effectively, a new data structure based on the Bloom filter, named the Nested Count Bloom Filter, is proposed. Finally, an identification model is built with the random forest algorithm and evaluated on a validation dataset to confirm that it remains usable when Tor-obfs4 traffic makes up only a small share of the total. Even with a low ratio of Tor-obfs4 traffic and with sampled traffic, the model achieves a precision of 91.3% and a recall of 100% for Tor-obfs4 traffic, with an F1 score of 0.955.

(2) An identification method is proposed for Tor traffic processed by meek. The method follows the communication process of Tor-meek traffic and uses the way meek processes traffic to analyze and select features for Tor-meek traffic. A new data structure based on a hash table, named the Flowprocess Hashtable, is then proposed and used to extract the features. Finally, an identification model is built. Adjusting the ratio of Tor-meek traffic to normal traffic shows that recall can reach 100% even when the ratio of Tor-meek traffic is very low, and that precision rises to about 95% as the ratio increases.

(3) Based on the above methods, a prototype system for Tor anonymous network traffic identification is designed and implemented. The thesis presents the overall architecture of the system and designs its traffic data processing module and its Tor obfuscated traffic identification module. The detection system is deployed in a real physical environment, where it processes traffic data and identifies obfuscated Tor traffic, and the identification results are displayed on a front-end page.
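As a quick consistency check on the Tor-obfs4 figures reported above, the F1 score can be recomputed from the stated precision and recall. The sketch below uses the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_score` is chosen here for illustration and is not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for Tor-obfs4 traffic.
reported_f1 = f1_score(0.913, 1.0)
print(round(reported_f1, 3))  # 0.955
```

The result agrees with the reported F1 score of 0.955.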
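The abstract does not describe the internals of the Nested Count Bloom Filter, so the sketch below shows only a plain counting Bloom filter, the general kind of structure such a design could build on. Everything in it is an assumption made for illustration: the class name, the table size, the number of hash functions, the use of SHA-256 to derive positions, and the flow key format.

```python
import hashlib


class CountingBloomFilter:
    """Plain counting Bloom filter (illustrative; not the thesis's nested design)."""

    def __init__(self, size: int = 1024, num_hashes: int = 3) -> None:
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size  # one small counter per slot instead of one bit

    def _indexes(self, key: str) -> set[int]:
        # Derive num_hashes positions by salting SHA-256 with a seed.
        # A set is used so a key never touches the same slot twice.
        positions = set()
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def add(self, key: str) -> None:
        for i in self._indexes(key):
            self.counters[i] += 1

    def estimate(self, key: str) -> int:
        # Upper bound on how often key was added; hash collisions can inflate it.
        return min(self.counters[i] for i in self._indexes(key))

    def remove(self, key: str) -> None:
        # Only decrement when the key appears to be present.
        if self.estimate(key) > 0:
            for i in self._indexes(key):
                self.counters[i] -= 1


# Hypothetical flow key: the 5-tuple format is an assumption, not from the thesis.
flow = "10.0.0.1:50000->192.0.2.7:443/tcp"
cbf = CountingBloomFilter()
cbf.add(flow)
cbf.add(flow)
print(cbf.estimate(flow))  # 2
```

Counters rather than single bits allow entries to be removed and allow per-key counts to be estimated, which is useful when packets of the same flow are tallied as they arrive.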
Keywords/Search Tags:Tor anonymous network, encrypted traffic identification, traffic processing, machine learning