Binary Code Similarity Analysis Techniques For Vulnerability Detection

Posted on:2022-02-05

Degree:Master

Type:Thesis

Country:China

Candidate:L R Cheng

GTID:2518306572451084

Subject:Cyberspace security

Abstract/Summary:

The amount of open source software keeps growing. Developers reference open source packages to improve development efficiency, so any vulnerability those packages contain is propagated every time the code is reused. Serious vulnerabilities can cause huge losses, and after code has been reused many times the affected scope expands sharply. It is therefore urgent to have technical means of detecting whether software contains known vulnerabilities and of minimizing the impact of potential vulnerabilities on program operation.

Vulnerability detection is essentially a code similarity matching task. Most existing detection methods based on traditional techniques rely on manually extracted vulnerability features, which gives them low scalability and high time complexity. Deep learning has been widely applied in many fields, such as natural language processing, because of its strong learning and representation ability. Assembly code shares many characteristics with natural language, so applying models from natural language processing to binary code problems is a promising approach.

This thesis designs a vulnerability detection engine based on binary code similarity analysis. The engine adopts a two-level detection scheme that works on basic blocks and on functions. It introduces a pre-trained model from natural language processing to extract the semantic features of basic blocks, and it designs two models for extracting the semantic and structural features of functions, one for each type of function. The function feature extraction model based on a graph neural network has high time complexity, and this thesis finds that, for functions with few basic blocks and a simple structure, aggregating function information with a multi-layer perceptron alone already achieves high accuracy. The engine therefore uses a simple multi-layer perceptron model for small functions with a simple structure and the graph neural network-based model for large functions with a complex structure. It then applies a similarity measure (such as Euclidean distance or cosine similarity) to the resulting function feature vectors, and functions whose similarity to an entry in the vulnerability library passes a set threshold are reported as suspected vulnerabilities.

Binary code similarity experiments on seven open source software projects, across compiler types and optimization levels, show that accuracy improves by about 10% over the existing model Asm2Vec. Finally, the vulnerability retrieval engine is applied to the public vulnerability library (ESH) to detect the publicly disclosed vulnerabilities of two CVEs, and in both cases the top-10 matching results contain the corresponding vulnerability from the library.
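The final matching stage described above (route a function to an encoder by size, compare feature vectors, apply a threshold, keep the top-10 results) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the basic-block cutoff of 5, the threshold of 0.8, and the toy two-dimensional vectors are all assumptions made for illustration, and the feature vectors are taken as given rather than produced by a trained model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two feature vectors (smaller is more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def choose_encoder(num_basic_blocks: int, small_limit: int = 5) -> str:
    """Pick the cheaper MLP for small functions, the GNN for larger ones.

    The cutoff `small_limit` is a hypothetical value, not one from the thesis.
    """
    return "mlp" if num_basic_blocks <= small_limit else "gnn"


def rank_candidates(
    query: Sequence[float],
    library: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # assumed value for illustration
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to `top_k` library entries whose cosine similarity to the
    query function is at least `threshold`, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)
    return hits[:top_k]


if __name__ == "__main__":
    # Toy vulnerability library with made-up entry names and vectors.
    vuln_library = {
        "vuln_a": [1.0, 0.0],
        "vuln_b": [0.0, 1.0],
        "vuln_c": [1.0, 1.0],
    }
    query_vector = [1.0, 0.1]
    print(choose_encoder(num_basic_blocks=3))
    for name, score in rank_candidates(query_vector, vuln_library, threshold=0.7):
        print(f"{name}: {score:.3f}")
```

Cosine similarity is used for ranking here because it ignores vector magnitude. If Euclidean distance is used instead, the comparison flips: candidates are kept when the distance is below the threshold and are sorted in ascending order.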

Keywords/Search Tags:

Binary code, Vulnerability detection, Attention mechanism, Graph neural network

Related items

1	Research On Source Code Vulnerability Detection Method Based On Graph Neural Network
2	Research On Source Code Vulnerability Detection Based On Deep Learning
3	Research On Software Vulnerability Detection Method Based On Code Property Graph And Graph Convolutional Neural Network
4	Research On Binary Similarity Detection Against Code Obfuscation Based On Generative Adversarial Network
5	Research On Similarity Detection Algorithm Of Binary Code Based On Graph Embedding Representation
6	Research On Binary-Code-Oriented Vulnerability Detection
7	Research On Open Source Repository And Graph Neural Network For Vulnerability Detection
8	Build Control Flow Graph In Binary Code Based On Graph Neural Network
9	Research And Implementation Of Security Vulnerability Automatic Detection System On Source Code
10	Research On Code Clone Detection Based On Deep Learning