
Research On Fingerprint Extraction And Identification Of HTTPS Web Traffic

Posted on: 2018-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: N Kang
GTID: 2348330533469392
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet communication technology, encrypted traffic accounts for a growing share of network traffic, because encryption protects user privacy and the security of data in transit. For Web communication, more and more websites use the HTTPS protocol to transmit Web data. However, some malicious actors take advantage of HTTPS to spread harmful information on HTTPS pages. To manage HTTPS traffic effectively, a method is needed to build fingerprint libraries for target HTTPS pages and to identify unknown HTTPS traffic against those fingerprints.

This thesis defines two characteristic values that can serve as fingerprint information for HTTPS pages. We first implemented a website fingerprint collection system that can build fingerprint libraries for target pages in real time. The system accesses the target HTTPS page automatically and, at the same time, captures the Web traffic through bypass monitoring to collect the page's fingerprint information. We then studied how well a Web page fingerprint identification method based on the C4.5 decision tree algorithm identifies HTTPS pages. We ran experiments on two datasets, one containing only target Web page traffic and the other also containing background traffic, and analyzed the results.

Building on these experimental results, this thesis proposes a Web page fingerprint identification algorithm based on the characteristics of Web page objects, and we implemented a model of this algorithm. Because real network environments contain a large amount of mixed Web traffic, we studied how well the algorithm identifies a single target page, multiple target pages, and target pages mixed with background traffic. Finally, we compared the two fingerprint identification methods and discussed their advantages, their disadvantages, and the network environments each is suited to. The results show that the algorithm based on Web page object characteristics is highly feasible for identifying traffic from multiple target pages, and that adding background traffic does not affect its normal operation.
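The abstract does not specify the two characteristic values or the matching rule, so the following is only a minimal sketch of how object-based fingerprint matching could work in principle, not the thesis's algorithm. It assumes, hypothetically, that a page's fingerprint is a list of encrypted object sizes in bytes and that observed traffic is a flat list of response sizes. The names `match_score`, `identify_pages`, `threshold`, and `tolerance` are all illustrative.

```python
from collections import Counter


def match_score(fingerprint, observed, tolerance=0):
    """Fraction of a page's fingerprint object sizes found in observed traffic.

    Hypothetical scoring rule: each fingerprint size greedily consumes one
    unused observed size within +/- tolerance bytes.
    """
    if not fingerprint:
        return 0.0
    remaining = Counter(observed)  # multiset of observed object sizes
    hits = 0
    for size in fingerprint:
        for candidate in range(size - tolerance, size + tolerance + 1):
            if remaining[candidate] > 0:
                remaining[candidate] -= 1  # each observed object matches once
                hits += 1
                break
    return hits / len(fingerprint)


def identify_pages(library, observed, threshold=0.8, tolerance=0):
    """Return the names of library pages whose score reaches the threshold."""
    scores = {
        page: match_score(sizes, observed, tolerance)
        for page, sizes in library.items()
    }
    return sorted(page for page, score in scores.items() if score >= threshold)


# Illustrative fingerprint library (made-up sizes, not data from the thesis).
library = {"page_a": [1200, 5300, 870], "page_b": [400, 9100]}

# page_a objects interleaved with unrelated background objects.
with_background = [1200, 333, 5300, 870, 7777]
print(identify_pages(library, with_background))

# Objects of both target pages mixed in one capture.
mixed = [400, 1200, 9100, 5300, 870, 64]
print(identify_pages(library, mixed))
```

Under this rule, extra background objects can only add candidates and never remove a page's own objects, which is one intuition for why a per-object method can tolerate mixed traffic. A classifier trained on whole-flow features has no such guarantee.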
Keywords/Search Tags: Traffic Identification, Fingerprint Identification, HTTPS, Website Fingerprint