
Research And Implementation Of Large-Scale Mobile Application Traffic Identification Technology

Posted on: 2021-05-23  Degree: Master  Type: Thesis
Country: China  Candidate: X F Shi  Full Text: PDF
GTID: 2518306557487274  Subject: Computer application technology
Abstract/Summary:
In recent years, the popularity of mobile terminals has continued to increase and the mobile application market has continued to expand, resulting in rapid growth of mobile network traffic. Traffic identification for mobile applications has become a hot research topic. Existing work on mobile application traffic identification mainly focuses on automatic collection of traffic data and intelligent analysis. In terms of traffic data collection, dynamic testing often suffers from low traffic coverage and incomplete acquisition of traffic data, while static information extraction suffers from high method complexity, long running time, or an application database that is too large to build, which makes large-scale analysis difficult. In terms of traffic analysis, existing work identifies plaintext traffic and encrypted traffic separately; from the perspective of system development, the two need to be integrated. In addition, existing research mines strong identifiers in plaintext traffic only by manual extraction and therefore cannot extract strong identifiers automatically at large scale.

In view of these problems, this thesis aims to balance traffic coverage against method complexity. It designs a traffic information extraction technology that combines dynamic testing and static analysis, and proposes a traffic identification method based on traffic content and traffic statistical characteristics. The specific work includes the following three aspects.

Firstly, this thesis proposes a mobile application traffic information acquisition method that combines a dynamic traffic information collection method based on the function call graph with a static traffic information extraction method based on data flow analysis. The former obtains the paths along which the application generates traffic by traversing the application's function call graph, and uses them to guide automated testing of the application on an Android emulator so that traffic information is collected more accurately. The latter analyzes the application's intermediate code and uses methods such as data flow analysis to extract the traffic information contained in the code. Integrating the two methods addresses the problems of low coverage and high complexity in traffic information collection.

Secondly, this thesis proposes a hierarchical application identification method for plaintext traffic based on strong identifiers and content fingerprints, and an application identification method for encrypted traffic based on statistical characteristics. For plaintext traffic, this thesis proposes an automatic strong identifier extraction technology and designs a fingerprint identification method based on Naive Bayes that uses traffic content information. This method achieves an accuracy of 90% when identifying a single flow; when 4 flows are analyzed consecutively, the accuracy exceeds 97%. For encrypted traffic, using the statistical characteristics of the traffic together with auxiliary information such as DNS domain names and SSL certificates, the identification accuracy exceeds 89%.

Thirdly, based on the above research, this thesis designs and implements a mobile application identification system that supports real-time traffic collection and online application identification.

In summary, this thesis achieves efficient traffic information acquisition through dynamic collection and static extraction. On this basis, application identification methods based on traffic fingerprints are designed and implemented for both plaintext and encrypted traffic. For plaintext traffic, this thesis also designs a large-scale automatic strong identifier mining method so that traffic can be identified more effectively. Finally, an application identification system is implemented that supports online identification of the source application of traffic.
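
The hierarchical plaintext identification described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the class name `HierarchicalIdentifier`, the tokenizer, the example identifier string, and the sample payloads are all hypothetical, and the abstract does not say how evidence from several flows is combined, so summing per-flow log scores is an assumption. The sketch first looks for a known strong identifier in each flow and only falls back to a multinomial Naive Bayes content fingerprint when none matches.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(payload):
    # Hypothetical tokenizer: lowercase alphanumeric runs of the plaintext payload.
    return re.findall(r"[a-z0-9]+", payload.lower())


class HierarchicalIdentifier:
    """Stage 1: strong identifier lookup. Stage 2: Naive Bayes content fingerprint."""

    def __init__(self, strong_identifiers):
        # Assumed input: mapping from an identifier substring to an application name.
        self.strong_identifiers = dict(strong_identifiers)
        self.token_counts = defaultdict(Counter)  # app -> token frequencies
        self.flow_counts = Counter()              # app -> number of training flows
        self.vocabulary = set()

    def train(self, app, payload):
        tokens = tokenize(payload)
        self.flow_counts[app] += 1
        self.token_counts[app].update(tokens)
        self.vocabulary.update(tokens)

    def _log_scores(self, payload):
        # Multinomial Naive Bayes with add-one (Laplace) smoothing.
        total_flows = sum(self.flow_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for app, n_flows in self.flow_counts.items():
            counts = self.token_counts[app]
            total_tokens = sum(counts.values())
            score = math.log(n_flows / total_flows)
            for token in tokenize(payload):
                if token in self.vocabulary:  # ignore tokens never seen in training
                    score += math.log((counts[token] + 1) / (total_tokens + vocab_size))
            scores[app] = score
        return scores

    def identify(self, payloads):
        # Stage 1: any strong identifier in any flow decides immediately.
        for payload in payloads:
            for identifier, app in self.strong_identifiers.items():
                if identifier in payload:
                    return app
        # Stage 2 (assumption): sum log scores over the consecutive flows.
        if not self.flow_counts:
            return None
        combined = defaultdict(float)
        for payload in payloads:
            for app, score in self._log_scores(payload).items():
                combined[app] += score
        return max(combined, key=combined.get)


# Illustrative data only; none of these strings come from the thesis.
model = HierarchicalIdentifier({"X-App-Id: com.example.chat": "chat"})
model.train("news", "GET /api/headlines HTTP/1.1 Host: news.example.test")
model.train("chat", "POST /v2/messages HTTP/1.1 Host: chat.example.test")
by_identifier = model.identify(["GET /x HTTP/1.1\r\nX-App-Id: com.example.chat"])
by_fingerprint = model.identify(["GET /api/headlines?page=2 HTTP/1.1"])
```

Passing several payloads to `identify` corresponds loosely to analyzing consecutive flows: each additional flow adds evidence to the combined score, which is one plausible reason multi-flow accuracy could exceed single-flow accuracy.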
Keywords/Search Tags: application identification, dynamic testing, static analysis, strong identifier, fingerprint identification