Research On Code Obfuscation Technology Based On Generative Adversarial Nets

Posted on: 2021-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Deng
Full Text: PDF
GTID: 2518306107460754
Subject: Cyberspace security
Abstract/Summary:
Code obfuscation applies a functionally equivalent transformation to program code so that the code becomes difficult to read and understand. With the development of software reverse engineering, people can analyze existing software and recover how a program is implemented through disassembly, decompilation, and dynamic tracing. If an attacker analyzes the key code of a program through reverse engineering, they may exploit vulnerabilities in that code. Code therefore needs to be obfuscated to keep it secure and to prevent misappropriation. However, traditional code obfuscation methods usually replace elements of the code (for example, replacing function and variable names with meaningless names), insert extraneous code that does not affect program execution, or replace logic with a form that is functionally equivalent but harder to understand. Because these methods follow specific obfuscation rules, reverse analysts can use those rules to recover the meaning of the code and carry out attacks. In this thesis, we propose COMBGAN, a code obfuscation model based on Generative Adversarial Nets, developed by analyzing the principles of code obfuscation and studying deep learning models and algorithms. First, we design an Encoder-Decoder model that can learn the logical structure of code, and integrate it into the generative model of COMBGAN. Second, because a conventional Encoder-Decoder model learns only the contextual semantics of a sequence and cannot learn the control flow information of code, we convert the code into an abstract syntax tree. The Encoder encodes all abstract syntax tree paths of a code segment into intermediate vector representations, which the Decoder then decodes into target code. Finally, we construct a discriminative model that determines whether generated obfuscated code comes from the generative model or from a traditional obfuscator, and that performs an obfuscation check on the code. The discriminative model and the generative model are trained adversarially to improve the accuracy of the obfuscated code produced by the generative model. The experiments use the Java-small dataset from the code2seq paper published at ICLR 2019, to which obfuscated code was added to form the Java-obscure dataset. On the Java-obscure dataset, comparison experiments between COMBGAN and existing deep learning models and code obfuscation tools show that COMBGAN outperforms the other models on the evaluated metrics. In addition, comparison experiments between COMBGAN and code2seq on the Java-small dataset show that COMBGAN improves precision, recall, and F1 by 2.71%, 0.14%, and 1.11% respectively over code2seq.
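The abstract syntax tree paths mentioned in the abstract can be illustrated with a small sketch. It is not the COMBGAN or code2seq implementation: it uses Python's standard `ast` module as a stand-in for a Java parser, and the helper names (`TERMINALS`, `terminal_token`, `root_trails`, `ast_paths`) and the choice of which nodes count as terminals are assumptions made for illustration. Each path runs from one terminal up to the lowest common ancestor and down to another terminal, which is the kind of path a code2seq-style encoder would take as input.

```python
import ast
from itertools import combinations

# Assumption: identifiers, constants and parameters are treated as terminals.
TERMINALS = (ast.Name, ast.Constant, ast.arg)


def terminal_token(node):
    """Return the surface token carried by a terminal node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.arg):
        return node.arg
    return repr(node.value)


def root_trails(tree):
    """Collect the root-to-terminal node list for every terminal in the tree."""
    trails = []

    def walk(node, trail):
        trail = trail + [node]
        if isinstance(node, TERMINALS):
            trails.append(trail)
            return  # do not descend below a terminal
        for child in ast.iter_child_nodes(node):
            walk(child, trail)

    walk(tree, [])
    return trails


def ast_paths(source):
    """Return (start_token, node_type_path, end_token) for each terminal pair."""
    trails = root_trails(ast.parse(source))
    paths = []
    for ta, tb in combinations(trails, 2):
        # Length of the shared prefix; its last node is the lowest common ancestor.
        k = 0
        while k < min(len(ta), len(tb)) and ta[k] is tb[k]:
            k += 1
        up = [type(n).__name__ for n in reversed(ta[k:])]
        lca = type(ta[k - 1]).__name__
        down = [type(n).__name__ for n in tb[k:]]
        paths.append(
            (terminal_token(ta[-1]), tuple(up + [lca] + down), terminal_token(tb[-1]))
        )
    return paths


if __name__ == "__main__":
    for start, path, end in ast_paths("def add(a, b):\n    return a + b\n"):
        print(start, " -> ".join(path), end)
```

In a model of the kind the abstract describes, each such path and its two end tokens would be embedded and encoded into a vector; this sketch stops at path extraction.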
Keywords/Search Tags: Code Obfuscation, Software Reverse, Deep Learning, Encoder-Decoder, Generative Adversarial Nets