
Research On Binary Similarity Detection Against Code Obfuscation Based On Generative Adversarial Network

Posted on: 2021-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Yang
Full Text: PDF
GTID: 2518306107450404
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, software has become an indispensable part of people's lives. It brings convenience, but it also causes many security problems: malicious attackers obtain users' private information by implanting malicious behavior. As code obfuscation tools mature, attackers increasingly use obfuscation techniques to evade detection and removal by security defense systems, which greatly increases the difficulty and cost of reverse analysis. This thesis proposes an obfuscation-resistant binary similarity detection model based on a generative adversarial network (GAN). The model generates obfuscation attacks that bypass the current detection model and, through alternating training, improves the robustness and detection rate of that detection model. First, two new binary feature extraction methods are proposed. The original binary and the obfuscated binary are disassembled to obtain assembly code, and at function granularity the assembly instructions are extracted into a feature vector that reflects the function's behavior and characteristics, together with an attributed control flow graph (ACFG) feature for the function. Second, a feature generation model for obfuscated functions, based on a graph generative adversarial network, is designed. Given a function, it generates a new obfuscated attributed control flow graph that bypasses the detection model. Third, a new binary similarity detection algorithm that resists code obfuscation is proposed. Compared with traditional detection algorithms and recently popular neural network detection algorithms, the similarity detection algorithm based on a cross-graph attention mechanism extracts richer structural and semantic information and has clear advantages in performance, accuracy and scalability. The experimental results show that the proposed model generally improves the accuracy of traditional obfuscation-resistant similarity detection models, and that its best detection accuracy reaches 99.4%, 96.7% and 96.9% for three commonly used obfuscation strategies: instruction substitution, control flow flattening and bogus control flow, respectively. Compared with traditional obfuscation generation algorithms, the new samples generated by the GAN reduce the detection rate of the current detection model to nearly zero, greatly enrich the diversity of obfuscation features, and are difficult to counter with defense methods based on retraining.
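The abstract does not spell out how the cross-graph attention mechanism is formulated, so the following Python sketch only illustrates the general idea under stated assumptions: each node of one function's graph attends over all nodes of the other graph, and the gap between a node and its attention-weighted match is used as a dissimilarity signal. The function names (`cross_graph_attention`, `graph_similarity`), the dot-product scoring and the final score formula are hypothetical choices for illustration, not the thesis's model, and the node embeddings are hand-written rather than extracted from real ACFGs.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_graph_attention(nodes_a, nodes_b):
    """For each node embedding in graph A, return the difference between
    that embedding and an attention-weighted summary of graph B's nodes."""
    diffs = []
    for h in nodes_a:
        # Attention weights of this A-node over every B-node (dot-product scores).
        weights = softmax([dot(h, g) for g in nodes_b])
        summary = [
            sum(w * g[k] for w, g in zip(weights, nodes_b))
            for k in range(len(h))
        ]
        diffs.append([hk - sk for hk, sk in zip(h, summary)])
    return diffs


def graph_similarity(nodes_a, nodes_b):
    """Hypothetical score in (0, 1]; higher means nodes in each graph find
    closer attention matches in the other graph."""
    diffs = cross_graph_attention(nodes_a, nodes_b) + cross_graph_attention(nodes_b, nodes_a)
    mean_sq = sum(dot(d, d) for d in diffs) / len(diffs)
    return 1.0 / (1.0 + mean_sq)


# Toy node embeddings standing in for per-basic-block ACFG features (assumed values).
original = [[10.0, 0.0], [0.0, 10.0]]
reordered = [[0.0, 10.0], [10.0, 0.0]]    # same blocks, different order
unrelated = [[-10.0, 0.0], [0.0, -10.0]]  # blocks with opposite features

print(graph_similarity(original, reordered))
print(graph_similarity(original, unrelated))
```

In this toy setup the score does not depend on node order, which is one reason attention-based matching is attractive when obfuscation rearranges basic blocks. A real model would learn the embeddings and the scoring function rather than fix them by hand.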
Keywords/Search Tags:Binary similarity detection, Against code obfuscation, Generative adversarial network, Cross graph attention mechanism