
Monitoring And Analysis Of HTTPS Encrypted Webpage Traffic

Posted on: 2022-01-03
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Kong
GTID: 2518306740994479
Subject: Cyberspace security
Abstract/Summary:
To protect user privacy and network security, most webpages are now encrypted with HTTPS. However, some criminals spread harmful information through HTTPS webpages, which poses a new challenge for network supervision. Traditional webpage identification methods focus on distinguishing the homepages of different websites, and little research addresses distinguishing different webpages within the same website. In addition, existing methods do not consider the impact of background traffic or of mixed traffic from multiple webpages, so they can identify encrypted webpages only under simple user behaviors. To address these problems, this thesis analyzes the behavioral characteristics of HTTPS webpage traffic and proposes an HTTPS webpage identification method. The main contributions of this thesis are as follows:

(1) Because the type of an HTTPS request is difficult to identify after encryption, the thesis proposes an HTTPS request type identification method based on the TLS record sequence. The method takes the TLS record as its basic unit and generates combined features from each TLS record together with its adjacent records. It first uses a random forest model to identify the TLS records in an HTTPS flow that contain HTTPS requests. Then, using a deep forest model, it mines the internal relationships among the combined features to classify those records at a finer granularity, that is, as GET, POST, or other requests, thereby providing semantic features for HTTPS webpage identification. Experimental results show that the method achieves high recall and precision.

(2) Because the traffic of different webpages on the same website is highly similar and therefore hard to tell apart, the thesis proposes an HTTPS webpage identification method based on multi-flow behavior. The method first extracts the SNI information of the target webpage traffic and builds an SNI fingerprint database to perform a preliminary screening of that traffic. On this basis, it extracts flow-based temporal and spatial features from the filtered traffic and uses the proposed Balance Cascade-Forest model to separate out the traffic of a single webpage. Finally, it extracts features of the single-webpage traffic based on the associated flows and uses a random forest model to perform fine-grained webpage identification. The experimental results show that the method based on multi-flow behavior can accurately identify the target webpages.

(3) Building on the above methods, the thesis designs and implements a prototype system for monitoring and analyzing HTTPS encrypted webpage traffic. The thesis first proposes the overall framework of the system, then designs and implements its functional modules, including the webpage traffic extraction module, the HTTPS request type identification module, and the HTTPS webpage identification module. Finally, the thesis tests and verifies the system with real network traffic. The experimental results show that the system can identify HTTPS encrypted webpage traffic and has good usability.
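The combined-feature idea in contribution (1) can be illustrated with a small sketch. The abstract does not list the exact features the thesis uses, so everything below is an assumption made for illustration: the function names `parse_tls_records` and `combined_features` are hypothetical, the only per-record attribute used is the record length, and the neighborhood is one record on each side. The sketch reads the standard 5-byte TLS record header (content type, protocol version, length) from a reassembled byte stream and pairs each record's length with the lengths of its neighbors. The resulting rows are the kind of input a random forest classifier could be trained on.

```python
import struct
from typing import List, Tuple

# TLS record header: content type (1 byte), protocol version (2 bytes), length (2 bytes).
TLS_HEADER_LEN = 5


def parse_tls_records(stream: bytes) -> List[Tuple[int, int]]:
    """Return (content_type, payload_length) for every complete TLS record
    in a reassembled one-direction byte stream."""
    records = []
    offset = 0
    while offset + TLS_HEADER_LEN <= len(stream):
        content_type, _version, length = struct.unpack_from("!BHH", stream, offset)
        if offset + TLS_HEADER_LEN + length > len(stream):
            break  # the last record is truncated; stop here
        records.append((content_type, length))
        offset += TLS_HEADER_LEN + length
    return records


def combined_features(records: List[Tuple[int, int]], window: int = 1) -> List[List[int]]:
    """For each record, build a row holding its own length plus the lengths of
    `window` records before and after it (0 where no neighbor exists).
    This feature choice is an illustrative assumption, not the thesis's feature set."""
    lengths = [length for _, length in records]
    rows = []
    for i in range(len(lengths)):
        row = []
        for j in range(i - window, i + window + 1):
            row.append(lengths[j] if 0 <= j < len(lengths) else 0)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Two synthetic application-data records (content type 23, version 0x0303).
    def make_record(payload: bytes) -> bytes:
        return struct.pack("!BHH", 23, 0x0303, len(payload)) + payload

    stream = make_record(b"abc") + make_record(b"hello")
    records = parse_tls_records(stream)
    print(records)                     # [(23, 3), (23, 5)]
    print(combined_features(records))  # [[0, 3, 5], [3, 5, 0]]
```

Using neighboring record lengths reflects the abstract's description of combining a single TLS record with its adjacent records. A real pipeline would need TCP reassembly before this step and would likely add timing and direction information.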
Keywords/Search Tags: Encrypted Traffic Identification, Machine Learning, Traffic Classification, Webpage Identification, HTTPS