Research On Similarity Detection Algorithm Of Binary Code Based On Graph Embedding Representation

Posted on: 2022-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: D D Li
GTID: 2518306314968689
Subject: Computer technology
Abstract/Summary:
Binary code similarity detection aims to determine whether two binary functions compiled for different platforms, by different compilers, or with different optimization options are similar. It has many applications in network security and intellectual property protection, such as IoT device vulnerability detection, malware analysis, and code plagiarism detection. Existing detection methods fall into two groups. The first relies on graph matching algorithms, which have high time complexity and are difficult to adapt to new tasks. The second is graph embedding detection based on neural networks, which uses a neural network to represent each binary function as a feature embedding vector and detects similarity by numerical computation between vectors. Methods of this second type do not consider differences in importance among neighbor nodes when generating the feature embedding vector, which reduces the accuracy of the feature representation, and they use a fixed distance metric when computing vector similarity, which leads to unsatisfactory detection results. To solve these problems, this paper proposes two improvements. First, to account for the differences in importance among neighbor nodes, this paper proposes a graph embedding generation network based on graph attention networks (GAT). Compared with the traditional structure2vec graph embedding generation network, the graph attention network uses an attention mechanism to learn weight coefficients between nodes, so that each node's features are learned according to the weights of its neighbors and the node's feature information is described more accurately. Second, to make the distance measurement adaptive through distance metric learning, this paper proposes using a multi-layer perceptron (MLP) neural network to learn the similarity of binary functions. The input of the network is the fused features of a pair of binary functions, and its output is a relationship score that expresses the similarity of the two functions. Based on these two improvements, this paper proposes a new binary code similarity detection algorithm based on graph embedding representation, Code Sim?new. Experiments show that the new algorithm outperforms existing detection algorithms in detection performance and accuracy.
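The two improvements described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not specify the layer sizes, the graph readout, or the fusion operator, so the mean-pooling readout, the concatenation-based fusion, the omission of the learned projection matrix, and all function and variable names below are assumptions made for illustration. The weights are random and untrained, so the printed score only demonstrates the data flow.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(features, neighbors, attn_src, attn_dst):
    """One simplified GAT-style layer (assumption: no projection matrix W).

    Each node's new vector is a weighted sum of itself and its neighbors,
    with weights produced by a softmax over attention scores.
    """
    out = []
    for i, h_i in enumerate(features):
        nbrs = [i] + list(neighbors[i])  # include a self-loop
        scores = [
            leaky_relu(dot(attn_src, h_i) + dot(attn_dst, features[j]))
            for j in nbrs
        ]
        alphas = softmax(scores)  # importance of each neighbor
        out.append([
            sum(a * features[j][k] for a, j in zip(alphas, nbrs))
            for k in range(len(h_i))
        ])
    return out


def graph_embedding(node_vectors):
    """Assumed readout: mean of all node vectors."""
    n = len(node_vectors)
    return [sum(col) / n for col in zip(*node_vectors)]


def mlp_relation_score(g1, g2, w_hidden, b_hidden, w_out, b_out):
    """Learned similarity: one-hidden-layer MLP over the fused pair.

    Assumed fusion: concatenation of the two graph embeddings.
    The sigmoid output lies in (0, 1) and plays the role of the
    relationship score.
    """
    fused = list(g1) + list(g2)
    hidden = [max(0.0, dot(row, fused) + b) for row, b in zip(w_hidden, b_hidden)]
    z = dot(w_out, hidden) + b_out
    return 1.0 / (1.0 + math.exp(-z))


# Toy example: two small control-flow graphs with 3 features per basic block.
rng = random.Random(0)
dim, hidden_dim = 3, 4
attn_src = [rng.uniform(-1, 1) for _ in range(dim)]
attn_dst = [rng.uniform(-1, 1) for _ in range(dim)]
w_hidden = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(hidden_dim)]
b_hidden = [0.0] * hidden_dim
w_out = [rng.uniform(-1, 1) for _ in range(hidden_dim)]

feats_a = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [2.0, 1.0, 0.0]]
nbrs_a = [[1], [0, 2], [1]]
feats_b = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
nbrs_b = [[1], [0]]

emb_a = graph_embedding(attention_aggregate(feats_a, nbrs_a, attn_src, attn_dst))
emb_b = graph_embedding(attention_aggregate(feats_b, nbrs_b, attn_src, attn_dst))
score = mlp_relation_score(emb_a, emb_b, w_hidden, b_hidden, w_out, 0.0)
print(f"relation score: {score:.3f}")
```

In contrast to a fixed metric such as cosine distance, the score here depends on trainable weights, which is what allows the similarity function itself to be learned from labeled function pairs.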
Keywords/Search Tags: code similarity detection, graph embedding generation network, graph attention network, distance measurement learning