Research On Stability Of Statistical Features In Encrypted Malicious APK Traffic

Posted on: 2023-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: X H Wang
GTID: 2558306905499444
Subject: Engineering
Abstract/Summary:
Smartphones are now ubiquitous, but users' security awareness for mobile applications remains weak, which makes security problems on mobile terminals more prominent than on other platforms. In addition, the growing share of encrypted traffic means that packet payloads are no longer visible to the detector, so cyber criminals can hide malicious activity inside encrypted traffic. Existing schemes for detecting encrypted malicious traffic generally rely on artificial intelligence, and all of them need to extract statistical features from the packet sizes and packet capture intervals observed during malware communication. However, such schemes achieve satisfactory accuracy only in the short run. Although related studies exist, they analyze the data only from the detector's perspective and do not verify the attacker's possible evasive behavior. In other, similar security fields, some studies have verified that modifying certain features does not damage malicious functionality. This thesis attempts to verify that the statistical features of encrypted malicious APK traffic can be modified while the malicious behavior is carried out, and analyzes the impact of such modification on the accuracy of detection schemes.

First, from the detector's perspective, this thesis characterizes the encrypted traffic of malicious APKs using only the packet size sequences and the packet capture interval sequences, and presents a machine learning scheme and a deep learning scheme. It finds that, because positive and negative sample traffic behave differently at the application layer, the one-dimensional and multi-dimensional distributions of both types of sequences differ considerably between the two classes of samples. Manual features are therefore extracted from each of the two distributions, and the quality of the feature extraction is evaluated with random forest and support vector machine classifiers. Both types of features are found to improve the detection ability of the machine learning scheme markedly. For the deep learning scheme, the thesis converts the information in the traffic into vector sequences and uses an LSTM network to learn features from packet sequences automatically, achieving higher detection accuracy than the machine learning schemes that use manual features.

To modify the statistical features of the traffic between a malicious APK and its control end while keeping the malicious behavior working normally, this thesis designs a general, fine-grained statistical feature modification scheme for Linux-based operating systems. The scheme uses dynamic instrumentation and places the instrumentation logic in a dynamic link library. When a Linux system call is hooked, the exported functions of that library are either invoked alongside it or used to replace it. The scheme records the TCP sockets created in each thread by filtering the parameters of a function, and tracks the life cycle of every TCP connection by hooking connection establishment and disconnection. With the help of self-designed data structures, the TCP connections owned by each thread are recorded correctly, and the statistical features of data sent by multiple threads over one TCP connection are well controlled. Given a target program, the scheme can modify the sending sizes and sending intervals of its TCP segments according to specified sequences.

Using this dynamic instrumentation scheme, the thesis changes the statistical features of the network traffic between the malicious APK and its control end while the malicious behavior still executes successfully, and conducts experiments in different scenarios. In each scenario, different methods are tried for modifying the sequences of packet sizes and packet sending intervals. The thesis finds that the accuracy of the original detection schemes drops significantly once the one-dimensional distribution of the two types of sequences is modified, and that accuracy is particularly sensitive to changes in the packet size sequences. If the randomness of the two types of sequences is increased further, the features manually extracted from the multi-dimensional distribution are also invalidated, which reduces detection accuracy further. In the worst case, all detection schemes lose almost all of their original detection capability, which indicates that the statistical features of encrypted traffic are unstable and may help developers of malicious APKs evade detection in the real world.
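
As a minimal illustration of the two ideas in the abstract, and not the thesis's actual implementation, the following Python sketch computes a few one-dimensional statistics over a packet size sequence and a packet interval sequence, then re-splits the same number of bytes into segments that follow a different size sequence. The function names `sequence_features` and `resegment`, the chosen statistics, and the sample values are all assumptions made for illustration. The abstract does not list the features the thesis extracts, and the thesis performs the modification by hooking Linux system calls rather than in Python.

```python
import statistics
from typing import Dict, List, Sequence


def sequence_features(sizes: Sequence[int],
                      intervals: Sequence[float]) -> Dict[str, float]:
    """Summarize one flow with simple one-dimensional statistics.

    These particular statistics are hypothetical examples of "manual
    features"; the thesis's real feature set is not given in the abstract.
    """
    return {
        "size_mean": statistics.fmean(sizes),
        "size_std": statistics.pstdev(sizes),
        "size_max": float(max(sizes)),
        "interval_mean": statistics.fmean(intervals),
        "interval_std": statistics.pstdev(intervals),
    }


def resegment(sizes: Sequence[int], target_sizes: Sequence[int]) -> List[int]:
    """Re-split the same total byte count into chunks that follow
    target_sizes (cycled), so the payload is unchanged but the
    size sequence seen on the wire is different."""
    if not target_sizes or any(t <= 0 for t in target_sizes):
        raise ValueError("target_sizes must contain positive integers")
    remaining = sum(sizes)
    chunks: List[int] = []
    i = 0
    while remaining > 0:
        chunk = min(target_sizes[i % len(target_sizes)], remaining)
        chunks.append(chunk)
        remaining -= chunk
        i += 1
    return chunks


if __name__ == "__main__":
    original_sizes = [1200, 1200, 600]        # made-up sample flow
    original_intervals = [0.01, 0.02, 0.03]   # seconds, made-up

    modified_sizes = resegment(original_sizes, [500, 300])
    # Assumed: the sender also chooses new, evenly spaced intervals.
    modified_intervals = [0.05] * len(modified_sizes)

    print(sequence_features(original_sizes, original_intervals))
    print(sequence_features(modified_sizes, modified_intervals))
```

The total number of bytes sent is identical before and after `resegment`, yet the mean, spread, and maximum of the size sequence all change, which is the kind of shift that can move a flow away from what a classifier trained on the original statistics expects.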
Keywords/Search Tags: Encrypted Traffic, Statistical Feature, Dynamic Instrumentation, Evasion, Malicious APK